ABSTRACT

The decentralized control of stochastic large-scale systems is con- 
sidered. Particular emphasis is given to control strategies which 
utilize decentralized information and can be computed in a decen- 
tralized manner. 

The deterministic constrained optimization problem is generalized to 
the stochastic case when each decision variable depends on different 
information and the constraint is only required to be satisfied on 
the average. For problems with a particular structure, a hierarchical 
decomposition is obtained. 

For the stochastic control of dynamic systems with different information
sets, a new kind of optimality is proposed which exploits the
coupled nature of the dynamic system. The subsystems are assumed
to be uncoupled and then certain constraints are required to be
satisfied, either in an "off-line" or an "on-line" fashion. For off-line
coordination, a hierarchical approach to solving the problem is
obtained. The lower level problems are all uncoupled. For on-line 
coordination, distinction is made between open loop feedback optimal 
coordination and closed loop optimal coordination. A hierarchical 
decomposition of the problem is possible in each case. The linear- 
quadratic-Gaussian problem is solved in detail for both off-line and 
on-line coordination. The resulting control strategies are found 
to have certain nice properties. 
CHAPTER 1


INTRODUCTION 

1. Nature of Large-Scale Systems 

It is hard to give a precise definition of a large-scale system, 
nor do we believe that there is one. A definition based on the number of 
components in the system is unsatisfactory since this would include systems 
which we would normally not consider large, e.g., a heated rod. Rather, 
the largeness of such systems seems to reflect the effort required to 
understand and control them. The following features, though not exhaustive, 
seem to be characteristic of most large-scale systems.

(a) Large number of equations, usually coupled, describing 
the system. 

(b) Large number of decision variables to be manipulated.
Usually these decision variables can be collected
into groups to be chosen by different agents according
to their spatial configuration or their function.

(c) The decision variables and state variables are so
distributed that the information available to the agents
in charge of the different groups of decision variables
is different. This feature is usually absent in
traditional small-scale control systems but is inevitable
in large-scale systems. This kind of information pattern
is sometimes termed nonclassical [W2].

(d) Presence of uncertainty. When uncertainty is absent,
it would be possible to exchange the total information
available among all the decision agents, thus rendering the
information pattern classical.

(e) More than one preference ordering for the entire system.
Two cases are possible. The same set of preference
orderings may be shared by all the decision agents or
each decision agent may have a different one. The first
case has been studied under the topic of nonscalar
performance criterion, e.g., ref. [H3]. The second case
generally arises in a game.

(f) Difficulty in modelling the system. This can be illustrated 
by the effort spent in understanding systems such as a 
power system or the economic system.

These last two aspects are very important but they will not be 
considered in this thesis. Rather, we shall assume that a model of the 
system is known and there is one single preference ordering for the entire 
system which is represented by a cost functional (performance index) . All 
the control agents choose their controls to optimize (minimize) this cost 
functional. We feel that the problem of controlling a large-scale system 
is complicated enough even without the last two features. 

Two constraints which may be neglected in the control of small 
scale systems become extremely important when the system is large. 

(a) Communication. It may be expensive or even technically 
impossible to provide good communication links between all 
the control agents. 

(b) Computation. The sheer size of some typical large-scale
systems, e.g., economic systems, may make the control
problem bigger than can be handled by the fastest
computers available. For the control of dynamic systems, we
actually need to distinguish between two kinds of computation,
off-line and on-line. Off-line computation is
what can be computed before the system starts running,
e.g., the computation of the optimal strategies. On-line
computation has to be done in real time while the system
is actually running, e.g., transforming the data received
in real time into decisions (controls) using the optimal
strategies computed off-line. In general, on-line computation
presents bigger problems than off-line computation
since it has to be done in real time.

Without these constraints, there would be little difference between
the control of large-scale and small-scale systems. The information
available to the control agents can be pooled together and the optimal
control policy solved for as in a small-scale problem. This policy can
then be dispatched to the control agents and implemented. The constraints
on communication and computation make this approach of centralized control 
impossible. Some form of decentralization is therefore necessary. This 
is the central issue in the control of large-scale systems. 

Another advantage of decentralization which is related to communi- 
cation is reliability. A design based on centralization cannot function 
properly if the communication links between the central agency and the 
subsystems fail. On the other hand, decentralized control has the nice 
property that a certain degree of autonomy is retained for each subsystem. 
Thus even though no signals are received from the central agency, some
form of optimality is still possible. 
2. Historical Survey

The design and control of large-scale systems has become a very
popular research area in system and control theory. Previously, this
problem was investigated mainly by economists and management scientists
who have to deal with systems much larger than those encountered by
engineers. The design of management information systems and decentralization
by price mechanisms in organizations can all be regarded as methods
of controlling large-scale systems [A5]. On the other hand, engineers
do have some experience with large-scale systems, e.g., the power system
which is more or less controlled in a hierarchical manner [S1].

Roughly speaking, past efforts on the control of large-scale systems
can be summarized into four categories.

(1) Resource allocation processes. These deal with a special
class of static systems called the economy. Given their
initial resource endowments, their production possibilities
and preferences, the economic units or participants of the
economy are to choose their production and exchange
activities such that a Pareto-optimal point is reached.

Let

$I = \{1, \ldots, n\}$: the set of economic units

$X$: the commodity space

$0$: the identity element of $X$

${}^iX^1 = {}^iX^2 = X$ for all $i \in I$
$Y^i = {}^iX^1 \times {}^iX^2$ for all $i \in I$
$Y = Y^1 \times \cdots \times Y^n$

For any $d^i \in {}^iX^1$, $z^i \in {}^iX^2$, the pair $(d^i, z^i) = s^i \in Y^i$ is
called an economic plan of the ith unit.

$d^i$: exchange activities

$z^i$: production activities

$s = (s^1, \ldots, s^i, \ldots, s^n)$: program

The ith component of the economic environment is defined
as the triple

$$({}^iA, w_0^i, {}^iR) = e^i$$

where

${}^iA$ is a non-empty subset of $Y$; the set of i-achievable
programs.

$w_0^i$ is an element of $X$; the initial resource endowment
of the ith unit.

${}^iR$ is a total ordering of the elements of ${}^iA$, i.e.,
a transitive, reflexive, connected relation defined
on ${}^iA$.

The economic environment is then defined as

$$e = (e^1, \ldots, e^i, \ldots, e^n)$$

Given an economic environment, an adjustment process is
a set of rules for exchanging information among the economic
units, regarding their components of the environment, in order
to reach an agreement about the economic program to be implemented.
Formally, an adjustment process $\pi$ is defined as

$$\pi = (L, f, \phi)$$

where

(a) $L$ is the set of messages that the units can use
for exchanging information.

(b) $f = (f^1, \ldots, f^n)$ is the set of n "response" functions,
$f^i : L^n \times E \to L$, where $E$ is the class of environments.

(c) $\phi$ is the outcome rule which associates with each
equilibrium message an outcome set $S$ of economic
programs.

Informational decentralization requires that the response 
function of each unit depends on the environment e only through 
its own components. Hurwicz [H4] and Camacho [C6] have 
presented different informationally decentralized adjustment 
processes whose equilibria are Pareto-optimal for different
economic environments. It should be noted that this class
of problems involves static, deterministic systems of a very
special nature. However, the decision making is also decentralized
since the economic units do not have to get together
to choose their strategies.

(2) Team decision problems. This class of problems was first
proposed by Marschak [M3]. There is a single objective
function and a number of decision makers each with different 
information on the state of the system. Optimal decision 
rules transforming the information into action are required. 

The scheme is informationally decentralized but the decision 
makers have to find their decision rules together. 


The linear-quadratic-Gaussian static case has been considered
by Radner [R1]. It is found that the optimal decision
rules are linear. Reference [M5] contains most of the original
work done. For the dynamic case, when one decision maker's
information depends on the action of another decision maker,
the situation is more complicated. We are used to the case
when the decision maker's information includes that of all the
decision makers who act before him. Under those circumstances,
the optimal decision rules are linear and are in fact given by
the "Separation Theorem" [A2, M2, W4]. However, Witsenhausen
showed by a counter-example [W3] that the optimal control strategies
need not be linear, contrary to the solution of ordinary
linear-quadratic-Gaussian problems. He also studied when the
Separation Theorem holds for problems with a nonclassical
information pattern [W2]. Ho and Chu [H3] gave conditions
on the information structure such that the optimal decision
rules are linear. Chong and Athans [C3] showed that the advantage
of decentralized information in team decision problems
may be offset by the additional complicated computation
required to find the optimal strategies. Aoki [A1] studied a
dynamic team in which the decision agents involved are allowed to
share information about their past control values.

(3) Hierarchical systems. The decision agents controlling the
system are arranged in a hierarchy of levels. Each agent in
a level communicates with several agents in the level under it
and with one agent in the level above. The agents at different
levels perform different functions. The agents at the lowest 
level actually interact with the system under control while 
those above act as information processing centers or make 
long term decisions. Although a hierarchical system has many intuitive
advantages [S2], such as
reliability, adaptability and ease of computation, very little
is available in the form of a mathematical theory [M1]. Most
of the work done in the hierarchical decomposition of systems 
has been inspired by mathematical programming. One version is 
the following. The subsystems in the large system are con- 
trolled as isolated units. This would be unsatisfactory since 
the subsystems are actually coupled to each other. The 
coupling effect is taken care of by a coordinator who sends 
out coordinating signals to the lower level controllers. The 
coordinating signals are chosen so that the overall
objective of the system is achieved. The original control 
problem is thus divided into two levels. The lower level 
problem consists of independent optimization problems dependent 
on the coordinating signal. The higher level problem is that 
of the coordinator. Much of the work done in the decomposition
of mathematical programming can be found in [L3] and [W5]. It
should also be noted that some of this work is actually
related to resource allocation processes. In general,
hierarchical decomposition methods motivated by mathematical
programming deal mainly with the computational aspects of the
problem. The presence of uncertainty and the flow of
information between subsystems are seldom treated.

(4) Controllability and stabilizability of decentralized dynamic
systems. Although this work is not primarily concerned with
optimization, it addresses itself to some very fundamental
questions in large-scale systems, namely the controllability
of decentralized systems and their feedback stabilizability.
Preliminary work was done by McFadden [M4] who considered a
system that arises in modelling certain aspects of economic
systems where several national agencies exercise regulatory
control power over different aspects of economic activities.
Aoki [A4] considers the stabilizability of decentralized
linear time-invariant dynamic systems with coordination and/or
communication among control agents. It is found that
controllability of the system no longer implies stabilizability
and the control agents must in general communicate with
each other in order to stabilize the system by feedback.
3. Motivation

From the previous section we can see that most of the work done
on large-scale systems treats the two important issues of computation
and information separately. Work dealing with the computational aspect 
of the optimization of large-scale systems is almost exclusively de- 
terministic. The flow of information within the system is therefore 
unimportant. On the other hand, work on decentralized information 
structure seldom considers the importance of computational requirements. 
Some of the optimal solutions to dynamic teams are computationally not 
feasible. Since in the actual control of large-scale systems, com- 
putational considerations are as important as those of information, 
decentralized information may not be as efficient as it may appear. 

Decentralized information structure almost inevitably gives 
rise to more complicated decision rules than centralized information 
structure. This may be explained as follows. Since each decision 
agent has only partial a posteriori information about the state of the 
system, he may want to generate the missing information using the 
common a priori information available to him. Mathematically, he is
required to extract whatever information is available in order
to be optimal.

This motivates us to use a broader interpretation of decentralized
information. We shall consider two kinds of information:
a priori information and a posteriori information. A priori information
consists of structural information and performance indices. A
posteriori information consists of measurements on the system.
Thus only a posteriori information may be decentralized, or both
a priori and a posteriori information may be decentralized. If only 
a posteriori information is decentralized, then the common a priori 
information may induce each decision agent to use very complicated 
decision rules. If appropriate decentralization is chosen for both 
a priori and a posteriori information, the decision rules for the 
decision agents will be simple. There will be a severe loss of 
optimality, however, since the individual agents do not know that they 
are controlling the same system. 

To compensate for the loss in optimality due to the decentra- 
lized information structure (both a priori and a posteriori) of the 
decision agents, we introduce a higher level coordinator who possesses 
all the a priori information. The coordinator may have a posteriori
information about the system but in general this information is less 
detailed than that of the decision agents. The duty of the coordinator 
is to transmit coordinating parameters to the individual decision 
agents such that the system is coordinated in some sense. 

We shall thus consider systems with a multilevel information 
structure. The higher level coordinator has all the a priori in- 
formation and some a posteriori information. The lower level decision 
agents have decentralized a priori information as well as decentralized 
a posteriori information. They also receive certain coordinating 
parameters from the coordinator. A structure with many levels can be
investigated, although in this thesis only the two-level structure
will be considered.
4. Structure of this Thesis

This thesis is structured in the following manner. 

In Chapter 2 the decomposition for a static stochastic 
optimization problem is considered. The problem under consideration 
consists of several decision agents each having different and noisy 
information on the state of the system. There is also a coordinator 
who sees that certain constraints are satisfied with respect to his 
own information. Results in mathematical programming are used to
obtain a hierarchical decomposition for this stochastic problem. It 
is shown that with the coordinating parameter transmitted from the 
coordinator, the lower level problems of the decision agents can be 
solved in a decentralized manner. 

In Chapter 3, the concept of decentralized a priori information
is used to obtain an off-line decomposition for nonlinear
stochastic dynamic systems. The lower level controllers assume that
they are controlling uncoupled dynamic systems with their decentralized
a posteriori information. The coordinator has all the common a priori
information and ensures that certain constraints are satisfied. This
is reformulated into a mathematical programming problem. A hier- 
archical scheme of finding the optimal strategies is then obtained. 

In Chapter 4, the approach suggested in Chapter 3 is used 
to find an off-line decomposition for the linear-quadratic-Gaussian 
problem. Both the lower level problems and the higher level problem
can be solved explicitly. The optimal local control strategy
for the ith controller is found to consist of two parts: a closed
loop part depending on the difference of his a posteriori local state
estimate and his a priori local state estimate and an open loop part 
depending on the coordinating parameters transmitted by the coordinator. 

In Chapter 5, the on-line decomposition of stochastic dynamic 
systems when the coordinator collects measurements from the lower 
level controllers and sends out coordinating parameters periodically 
is considered. Both open loop feedback optimal coordination and closed
loop optimal coordination are discussed. For open loop feedback optimal
coordination, the results in Chapters 3 and 4 are used to treat the 
nonlinear and linear-quadratic-Gaussian cases. For closed loop optimal
coordination a functional equation which has to be solved is
derived. With the help of the solution of a special dynamic team we
arrive at explicit solutions for the linear-quadratic-Gaussian case.
This is compared with the corresponding solution from open loop feedback
optimal coordination.

In Chapter 6, we review the philosophy of this thesis and 
summarize the results obtained. Suggestions for future research are 
also given. 
5. Contribution of this Thesis

The main contribution of this thesis is the simultaneous
treatment of the issues of computation and information in the control 
of large-scale systems. The concept of decentralized information is 
extended to include decentralized a priori information as well as de- 
centralized a posteriori information. Thus decentralized control 
schemes, both computational and informational, are obtained. To com- 
pensate for the loss in optimality, we introduce an extra coordinator 
who has the common a priori information and some a posteriori in- 
formation and influences the lower level through coordinating para- 
meters. This is a new approach to the control of large-scale systems. 

For static systems, stochastic optimization problems which 
have both the features of team decision problems and resource allo- 
cation problems are considered. Again this approach considers de- 
centralized computation and information simultaneously. 

For dynamic systems, the distinction between the two types of 
periodic coordination is new. Specialization to the linear-quadratic-
Gaussian case gives results which are intuitively attractive.



CHAPTER 2 


DECOMPOSITION FOR A STATIC STOCHASTIC OPTIMIZATION PROBLEM 
1. Introduction 

In this chapter we consider the stochastic optimization problem 
of a static system consisting of several subsystems. Each subsystem has 
a decision agent which has noisy information on the state of the system. 
The overall objective of the system is the sum of individual objectives 
of the subsystems. The subsystems are uncoupled except for constraints
which couple them together. Contrary to the deterministic case, the
constraints do not have to be satisfied exactly. Rather, the problem
solver only requires the constraints to be satisfied on the average. We
have thus a constrained stochastic optimization problem with several
decision agents each having noisy and different information on the state.
The many decision agent aspect of the problem has been considered under
the heading of team theory [R1]. For a constrained deterministic problem
with the special structure described above, a hierarchical decomposition
has been obtained using mathematical programming [L1]. We shall consider
the two aspects of the problem simultaneously and obtain a hierarchical
decomposition. This static problem is not only interesting for
its own sake but is also useful for the decomposition of dynamic systems. 

In the next section we present an example to motivate the general 
problem that we will study in this chapter. In Section 3 we review some 
results in non-linear programming; these can be used to obtain the decom- 
position of a static optimization problem when the state of the system 
is observed exactly. In Section 4 the stochastic optimization problem 
is formulated for the case when the state of the system is not known 
exactly. In Section 5 the decomposition of the stochastic problem is
investigated. Conditions under which the decomposition is well-posed
are given and related to the information structure of the system. In
Section 6 these results are stated in terms of measurement functions.
The stochastic version of the example is solved in Section 7 and compared
with the deterministic solution.
2. An Example
Consider a manufacturing company with N divisions, each producing
a set of different commodities using the same resources. The ith division
produces $u_i$ units of goods $G_i$ from $A_i u_i$ units of raw material at a cost
of $u_i' R_i u_i$, where $R_i$ is assumed to be a positive definite matrix.
The market price of $G_i$ is $2\pi_i$ and the total resources available
are $v$.

Given any price vector $2\pi_i$ and production $u_i$, the profit function
of the ith division is

$$-f_i(u_i, \pi_i) = 2 u_i' \pi_i - u_i' R_i u_i \qquad (2.2.1)$$

The total profit of the company is the sum of the profits of all the
divisions, i.e.,

$$-f(u, \pi) = -\sum_{i=1}^{N} f_i(u_i, \pi_i) \qquad (2.2.2)$$

The objective of the company is to minimize the total loss (maximize the
total profit) subject to the constraint that the total resources used are
less than the total resources available. The problem is thus

Problem 2.1:

$$\underset{u_1, \ldots, u_N}{\text{Minimize}} \; \sum_{i=1}^{N} \left( u_i' R_i u_i - 2 u_i' \pi_i \right) \qquad (2.2.3)$$

subject to

$$\sum_{i=1}^{N} A_i u_i - v \le 0 \qquad (2.2.4)$$

Remark: We could have imposed the additional constraint that $u_i \ge 0$,
$i = 1, \ldots, N$, but for simplicity we have assumed implicitly that the $u_i$'s
would turn out to be non-negative when Problem 2.1 is solved.
In this example the state of the system consists of the price
vectors $\pi_i$, $i = 1, \ldots, N$, the resource vector $v$ and possibly the cost
matrices $R_i$ and the resource utilization matrices $A_i$. The decisions
to be chosen are $u_i$, $i = 1, \ldots, N$. Calling the state $x$ we have the
following general problem.

Problem 2.2:

$$\text{Minimize} \; \sum_{i=1}^{N} f_i(u_i, x)$$

subject to

$$\sum_{i=1}^{N} g_i(u_i, x) - g_0(x) \le 0 \qquad (2.2.5)$$

For our example

$$f_i(u_i, x) = u_i' R_i u_i - 2 u_i' \pi_i \qquad (2.2.6)$$

$$g_i(u_i, x) = A_i u_i \qquad (2.2.7)$$

$$g_0(x) = v \qquad (2.2.8)$$

There are situations when the state of the system cannot be observed
exactly, but is described probabilistically. Suppose now that $\pi_i$ is
measured by the ith division manager as

$$z_i = \pi_i + \theta_i, \quad i = 1, \ldots, N \qquad (2.2.9)$$

and $v$ is measured by the resource manager as

$$z_0 = v + \theta_0 \qquad (2.2.10)$$

$\pi_i$, $\theta_i$, $i = 1, \ldots, N$, $v$ and $\theta_0$ are random vectors independent of each
other and having the normal distributions (assumed known)

$$E\{\pi_i\} = \bar{\pi}_i; \quad \mathrm{Var}\{\pi_i\} = \Pi_i \qquad (2.2.11)$$

$$E\{\theta_i\} = 0; \quad \mathrm{Var}\{\theta_i\} = \Theta_i, \quad i = 1, \ldots, N \qquad (2.2.12)$$

$$E\{v\} = \bar{v}; \quad \mathrm{Var}\{v\} = V \qquad (2.2.13)$$

$$E\{\theta_0\} = 0; \quad \mathrm{Var}\{\theta_0\} = \Theta_0 \qquad (2.2.14)$$

All the information available is contained in the measurements $z_i$,
$i = 0, \ldots, N$. The production of each division has to be based on its
measurement $z_i$ and some other signal based on $z_0$.

The objective of the company is to minimize the expected total loss.
As for the resource constraint (2.2.4), it can no longer be satisfied
exactly since $v$ is not measured exactly. Instead, we require the total
resources used to be less than the total resources available given the
measurement $z_0$, i.e.,

$$E\Big\{ \sum_{i=1}^{N} A_i u_i - v \;\Big|\; z_0 \Big\} \le 0 \qquad (2.2.15)$$

The production of each division has to use some information contained
in $z_0$ because the resource constraint (2.2.15) has to be satisfied. We
thus have the following problem.

Problem 2.1A:

$$\text{Minimize} \; E\Big\{ \sum_{i=1}^{N} \left( u_i' R_i u_i - 2 u_i' \pi_i \right) \Big\} \qquad (2.2.16)$$

subject to

$$E\Big\{ \sum_{i=1}^{N} A_i u_i - v \;\Big|\; z_0 \Big\} \le 0 \qquad (2.2.17)$$

$$u_i = \gamma_i(z_i; z_0), \quad i = 1, \ldots, N \qquad (2.2.18)$$

Remark: $u_i$ can depend at most on all the information contained in $z_i$, $z_0$.

We shall show later that the optimal decision function in some cases can 
be found in a hierarchical manner and the operation of the company can be
decentralized.
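
To make the measurement model of (2.2.9) concrete, the following minimal sketch treats a single scalar price observed in independent zero-mean Gaussian noise and evaluates the standard Gaussian conditional mean of the price given the measurement. It is an illustration only and not the solution of Problem 2.1A, which the thesis develops later; the function names, the scalar restriction and the Monte Carlo check are assumptions introduced here.

```python
import random


def conditional_mean(z, prior_mean, prior_var, noise_var):
    """E{price | z} for z = price + noise, with independent scalar Gaussians."""
    gain = prior_var / (prior_var + noise_var)  # weight given to the measurement
    return prior_mean + gain * (z - prior_mean)


def simulated_error_variance(prior_mean, prior_var, noise_var, n=200_000, seed=0):
    """Monte Carlo estimate of E{(price - E{price | z})**2}."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        price = rng.gauss(prior_mean, prior_var ** 0.5)   # draw the true price
        z = price + rng.gauss(0.0, noise_var ** 0.5)      # noisy measurement
        total += (price - conditional_mean(z, prior_mean, prior_var, noise_var)) ** 2
    return total / n
```

With prior variance 4 and noise variance 1 the gain is 0.8, and the simulated error variance should lie close to the theoretical value 4·1/(4+1) = 0.8.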
3. Decomposition of a Non-Linear Programming Problem

In this section, we present some results in non-linear programming. 
These give rise immediately to a decomposition method for deterministic 
problems. Later on they will be used to obtain a decomposition for the 
stochastic case. 

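Consider the mathematical programming problem.

Problem 2.3:

$$\text{Minimize} \; f(u_1, \ldots, u_N)$$

subject to

$$g(u_1, \ldots, u_N) \le 0 \in R^p \qquad (2.3.1)$$

$$u_i \in \mathcal{U}_i, \quad i = 1, \ldots, N$$

where

$$f(u_1, \ldots, u_N) = f_1(u_1) + \cdots + f_N(u_N) \qquad (2.3.2)$$

$$g(u_1, \ldots, u_N) = g_1(u_1) + \cdots + g_N(u_N) - g_0 \qquad (2.3.3)$$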

Except for the coupling constraint (2.3.1) , the problem is essentially 
uncoupled. The constraint may be interpreted as the common resource avail- 
able to all the decision makers. This structure has been exploited to give 
a hierarchical decomposition scheme for the solution of the problem using 
results in mathematical programming. We state one sufficient condition 
which makes this possible. 

Theorem 2.3.1 (Saddle-point condition): Let $f$ be a real-valued function
defined on a subset $C$ of a linear space $U$. Let $g$ be a mapping from $C$ into
the Euclidean space $R^p$. Suppose there exists a $p^* \in R^p$, $p^* \ge 0$, and a $u^* \in C$
such that the Lagrangian $L(u, p) = f(u) + p' g(u)$ possesses a saddle-point at
$u^*, p^*$, i.e.,

$$L(u^*, p) \le L(u^*, p^*) \le L(u, p^*) \qquad (2.3.4)$$

for all $u \in C$, $p \ge 0$. Then $u^*$ solves

$$\text{minimize } f(u) \quad \text{subject to } g(u) \le 0, \; u \in C \qquad (2.3.5)$$
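
The argument behind this sufficiency result is short. The sketch below is the standard one, written for the Lagrangian $L(u, p) = f(u) + p' g(u)$ with multiplier $p \ge 0$; it is included for convenience and is not reproduced from the cited reference.

```latex
% Sketch: a saddle point (u*, p*) of L(u,p) = f(u) + p'g(u) yields a constrained minimum.
\begin{align*}
&\text{(i) } L(u^*, p) \le L(u^*, p^*) \ \ \forall\, p \ge 0
   \;\Longrightarrow\; (p - p^*)'\, g(u^*) \le 0 \ \ \forall\, p \ge 0. \\
&\quad \text{Letting one component of } p \text{ grow without bound forces } g(u^*) \le 0; \\
&\quad \text{taking } p = 0 \text{ gives } p^{*\prime} g(u^*) \ge 0,
   \text{ and } p^* \ge 0,\ g(u^*) \le 0 \text{ give } p^{*\prime} g(u^*) \le 0,
   \text{ so } p^{*\prime} g(u^*) = 0. \\
&\text{(ii) } L(u^*, p^*) \le L(u, p^*) \ \ \forall\, u \in C
   \;\Longrightarrow\; f(u^*) \le f(u) + p^{*\prime} g(u) \le f(u) \\
&\quad \text{for every } u \in C \text{ with } g(u) \le 0,
   \text{ since then } p^{*\prime} g(u) \le 0 .
\end{align*}
```

Step (i) shows that $u^*$ is feasible and that complementary slackness holds; step (ii) shows that no feasible $u$ has a smaller cost.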

The proof of this theorem is elementary [L2]. Note that there are
no conditions on the convexity or differentiability of $f$ or $g$. For
equality constraints, the same result holds except that $p$ is no longer
required to be non-negative. The following theorem is due to Lasdon [L1].
Theorem 2.3.2: Suppose there exists a saddle-point for the Lagrangian
corresponding to Problem 2.3. Then the following hierarchical scheme can
be used to obtain a solution, provided the minimizing problem is well-posed.*

Lower level:

$$\underset{u_i}{\text{Minimize}} \; L_i(u_i, p) = f_i(u_i) + p' g_i(u_i)$$

$$\text{subject to } u_i \in \mathcal{U}_i, \quad i = 1, \ldots, N \qquad (2.3.6)$$

Higher level:

$$\underset{p}{\text{Maximize}} \; \sum_{i=1}^{N} L_i^*(p) - p' g_0$$

$$\text{subject to } p \ge 0 \qquad (2.3.7)$$

where $L_i^*(p)$ is the minimum obtained in equation (2.3.6).


*For some $p$, the lower level problem may not have a solution. We thus have
to limit $p$ to the set $D = \{p \mid \text{the lower level problem has a solution}\}$.
Proof: We need the fact that the constrained saddle-point for $L(a, b)$,
$a \in A$, $b \in B$, exists if and only if [yl]

$$\min_{a \in A} \max_{b \in B} L(a, b) = \max_{b \in B} \min_{a \in A} L(a, b) \qquad (2.3.8)$$

The value of the saddle-point is also equal to either side of equation
(2.3.8). Given any $p$ we note that the minimization part on the right
side of equation (2.3.8) can be split up into N minimization problems
independent of each other. Specifically, we have

$$\max_{p \ge 0} \min_{u \in \mathcal{U}} L(u, p)
 = \max_{p \ge 0} \min_{u} \Big\{ \sum_{i=1}^{N} f_i(u_i) + \sum_{i=1}^{N} p' g_i(u_i) - p' g_0 \Big\}
 = \max_{p \ge 0} \Big\{ \sum_{i=1}^{N} \min_{u_i} \big( f_i(u_i) + p' g_i(u_i) \big) - p' g_0 \Big\} \qquad (2.3.9)$$

Equations (2.3.6) and (2.3.7) are obtained by making the appropriate
identifications. Q.E.D.

Theorem 2.3.2 suggests a way of finding the optimal $p^*$ and $u^*$ simultaneously.
This requires giving $L_i^*(p)$ as a function of $p$. There are
numerical methods [L1] by which the optimal solution is found recursively
by choosing a new $p_{t+1}$ depending on the result of optimizing the dual
function $\sum_{i=1}^{N} L_i(u_i, p_t)$. However, we are more interested in the structure
of the decomposition, i.e., once an optimal $p^*$ is found, the lower level
problems are uncoupled. The significance of this is more obvious when we
look at the parametric case given by Problem 2.2. For each $x$ we have a
mathematical programming problem; $x$ may be regarded as the state of
the system which is known exactly. If we use the result of Theorem 2.3.2,
the optimal $p^*$ would be a function of $x$, i.e., $p^*(x)$. With this optimal
$p^*(x)$, the lower level problems would be

$$\underset{u_i \in \mathcal{U}_i}{\text{Minimize}} \; L_i(u_i, p^*(x), x) = f_i(u_i, x) + p^*(x)' g_i(u_i, x), \quad i = 1, \ldots, N \qquad (2.3.11)$$

Thus we can regard the higher level and lower level decision makers
as both making observations on the system. The higher level decision maker
(coordinator) observes the state $x$, chooses the coordinating parameter $p^*(x)$
and transmits it to the lower level. The lower level decision makers then
use this, together with $f_i$, $g_i$ and $x$, to choose their optimal decisions.
This is displayed in Fig. 2.1.

Applying this result to the example given in Problem 2.1 we have the
following decomposition:

Lower level (Division manager):

$$\min_{u_i} \; u_i' R_i u_i - 2 u_i' \pi_i + p' A_i u_i, \qquad i = 1, \ldots, N \tag{2.3.12}$$

Denote the optimal value of (2.3.12) by $L_i^*(p)$.

Higher level (Resource manager):

$$\max_{p \ge 0} \; \sum_{i=1}^{N} L_i^*(p) - p' v \tag{2.3.13}$$

From these equations we obtain the following optimal $u_i^*$, $i = 1, \ldots, N$, and $p^*$:

$$u_i^* = R_i^{-1} \big( \pi_i - \tfrac{1}{2} A_i' p^* \big) \tag{2.3.14}$$

$$p^* = \arg\max_{p \ge 0} \Big\{ -\tfrac{1}{4}\, p' \Big( \sum_{i=1}^{N} A_i R_i^{-1} A_i' \Big) p + p' \Big( \sum_{i=1}^{N} A_i R_i^{-1} \pi_i - v \Big) - \sum_{i=1}^{N} \pi_i' R_i^{-1} \pi_i \Big\} \tag{2.3.15}$$


Referring to equation (2.3.12) we see that the loss function of the
ith division manager has been modified by the addition of an extra term
which reflects the cost of resources: $p$ is the price of the resources
while $A_i u_i$ denotes the amount used.
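
The price interpretation of (2.3.12) and (2.3.13) can be tried numerically. The following is a minimal sketch, not the method of reference [L1]: it assumes each $R_i$ is symmetric positive definite and uses a plain projected gradient ascent on the dual. The function names, step size, and the two-division data are hypothetical and chosen only for illustration.

```python
import numpy as np

def division_decision(R, pi, A, p):
    """Lower level: minimize u'Ru - 2u'pi + p'Au (R symmetric positive definite)."""
    return np.linalg.solve(R, pi - 0.5 * A.T @ p)

def coordinate_prices(Rs, pis, As, v, step=0.1, iters=2000):
    """Higher level: projected gradient ascent on the dual over p >= 0."""
    p = np.zeros(len(v))
    for _ in range(iters):
        used = sum(A @ division_decision(R, pi, A, p)
                   for R, pi, A in zip(Rs, pis, As))
        # The dual gradient is (resources used) - (resources available).
        p = np.maximum(0.0, p + step * (used - v))
    us = [division_decision(R, pi, A, p) for R, pi, A in zip(Rs, pis, As)]
    return p, us

# Hypothetical data: two divisions sharing one scalar resource.
Rs = [np.array([[1.0]]), np.array([[2.0]])]
pis = [np.array([2.0]), np.array([4.0])]
As = [np.array([[1.0]]), np.array([[1.0]])]
v = np.array([1.0])

p_star, u_star = coordinate_prices(Rs, pis, As, v)
print("price:", p_star, "decisions:", u_star)
```

Once the price has converged, each division needs only its own $R_i$, $\pi_i$, $A_i$ and the price to compute its decision, which is the uncoupling described above.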

In this deterministic case, the lower level decision makers base
their decisions on $\pi_i$ while the higher level bases his decision on $\pi_i$, $i = 1, \ldots, N$,
and $v$. There is some decentralization of information, but the higher
level in fact needs more information than the lower level. In the general
deterministic case, both levels need the same information $x$, which is
not too satisfactory. This leads us to study the stochastic case when
information can also be decentralized.

4. Formulation of the Stochastic Problem

We now consider the case when the state $x$ is not known exactly by
the different decision makers. However, there is a probability description
on the state space $X$ given by the triplet $(X, B, \mu)$. $B$ is a σ-algebra on
$X$, and $\mu$ is a probability measure.

Let $F_i$, $i = 1, \ldots, N$, be sub-σ-algebras of $B$. $F_i$ represents the information
available to the ith decision maker. Since the state $x$ is not
observed exactly, $u_i$ will be required to be generated by a function $\gamma_i$
measurable with respect to $F_i$. This is equivalent to the existence of a
measurement function $h_i$ on $X$ such that $u_i$ depends on the measurement
$z_i = h_i(x)$ [H1]. Denote by $\Gamma_i$ the set of admissible decision functions $\gamma_i$
measurable with respect to $F_i$. Then $\gamma \triangleq (\gamma_1, \ldots, \gamma_N) \in \Gamma_1 \times \cdots \times \Gamma_N \triangleq \Gamma$.

Given any decision function $\gamma$, $f(\gamma(x),x)$ would be a random variable. As
in the case of team decision problems, $\gamma$ is chosen to minimize the expected
payoff $E\{f(\gamma(x),x)\}$.

For the constraint several alternative formulations are possible.

1. $$g(\gamma(x),x) \le 0 \quad \text{a.e.} \tag{2.4.1}$$

As would be expected, it is rather difficult to satisfy this constraint.

2. $$\text{Prob}\{g(\gamma(x),x) \le 0\} \ge b \tag{2.4.2}$$

where $b$ is some given probability.

Particular cases of this problem have been studied under the heading
of chance constrained programming [C1]. This is the situation where the
constraint is only required to be satisfied with a given probability.

3. $$E\{g(\gamma(x),x) \mid F_0\} \le 0 \quad \text{a.e.} \tag{2.4.3}$$

where $F_0$ is some sub-σ-field of $B$. $F_0$ specifies the degree of exactness
with which the constraint has to be satisfied or, in other words, the
information of a coordinator who sees that the constraint is satisfied.

Two extreme cases are possible:

a. $$F_0 = \{\emptyset, X\} \tag{2.4.4}$$

This corresponds to no measurements for the coordinator. Then

$$E\{g(\gamma(x),x)\} \le 0 \tag{2.4.5}$$

b. $$F_0 = B \tag{2.4.6}$$

This corresponds to measuring the state almost exactly. Then

$$g(\gamma(x),x) \le 0 \quad \text{a.e.} \tag{2.4.7}$$

With the introduction of the constraint, the information available
to the decision makers may not be sufficient to insure that the constraint
is satisfied. In general some extra information has to be communicated
from the coordinator to the decision makers.
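
To make the difference between the constraint formulations concrete, the following sketch evaluates them on a small finite state space. Everything in it is a hypothetical illustration: the four equally likely states, the values of $g(\gamma(x),x)$ for one fixed decision rule, and the representation of the coordinator's σ-field $F_0$ as a partition given by one cell label per state.

```python
import numpy as np

# Hypothetical finite state space: four equally likely states.
prob = np.array([0.25, 0.25, 0.25, 0.25])
# Hypothetical values of g(gamma(x), x) for one fixed decision rule gamma.
g = np.array([-1.0, 0.5, -2.0, 1.0])
# Coordinator information F_0 as a partition: one cell label per state.
coord_cell = np.array([0, 0, 1, 1])

def conditional_expectation(values, prob, cells):
    """E{values | partition}, returned as one number per state."""
    out = np.empty(len(values))
    for c in np.unique(cells):
        mask = cells == c
        out[mask] = np.dot(values[mask], prob[mask]) / prob[mask].sum()
    return out

almost_sure_ok = bool(np.all(g[prob > 0] <= 0))          # formulation (2.4.1)
chance = float(prob[g <= 0].sum())                       # left side of (2.4.2)
cond = conditional_expectation(g, prob, coord_cell)      # left side of (2.4.3)
average = float(np.dot(g, prob))                         # trivial F_0, (2.4.5)

print(almost_sure_ok, chance, cond, average)
```

In this example the rule violates the almost-everywhere constraint and meets the chance constraint only with probability one half, yet it satisfies the conditional constraint for this coordinator partition and the averaged constraint.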

We will investigate what this information should be. Let $\Gamma_i' \supseteq \Gamma_i$
be the new admissible functions. $\Gamma_i'$ is the set of functions measurable with
respect to $F_i \vee F_0$, the σ-algebra generated by $F_i$ and $F_0$. Thus we have
formulated the following stochastic optimization problem.

Problem 2.4: Minimize $E\{f(\gamma(x),x)\}$

Subject to $E\{g(\gamma(x),x) \mid F_0\} \le 0$ a.e.

$\gamma = (\gamma_1, \ldots, \gamma_N) \in \Gamma_1' \times \cdots \times \Gamma_N'$

$f(\gamma(x),x) = f_1(\gamma_1(x),x) + \cdots + f_N(\gamma_N(x),x)$

$$g(\gamma(x),x) = g_1(\gamma_1(x),x) + \cdots + g_N(\gamma_N(x),x) - g_0(x) \tag{2.4.8}$$

Remark: $\Gamma_i'$ is the set of decision functions which use both the information
of the ith decision maker as well as the information of the coordinator.

We shall show later that not all the information of the coordinator is
needed by the ith decision maker to choose his best decision. Under
certain conditions, the information of the coordinator can be compressed
into a signal which will be sufficient for the ith decision maker.

5. Decomposition of the Stochastic Problem

The special form of the constraint allows us to transform Problem 2.4
into a simpler form for which the results of Section 3 are applicable.

Lemma 2.5.1: Let $f(\gamma(x),x)$ be a random function from $\Gamma' \times X$ into the reals,
where $\Gamma'$ is a set of functions on $X$ measurable with respect to $F \vee F_0$
(the σ-algebra generated by $F$ and $F_0$), $F \subset B$ and $F_0 \subset B$.
$\Gamma$ is the set of functions measurable with respect to $F$.

Let $M = \{\gamma \mid E\{g(\gamma(x),x) \mid F_0\} \le 0 \ \text{a.e.}\}$.

Suppose $\min_{\gamma(\cdot;y) \in \Gamma \cap M} E\{f(\gamma(x;y),x) \mid F_0\}(y)$ exists a.e. and is equal
to $E\{f(\gamma^*(x;y),x) \mid F_0\}(y)$, then

$$\min_{\gamma \in \Gamma' \cap M} E\{f(\gamma(x),x)\} = E\{f(\gamma^*(x;x),x)\}
= E\Big\{ \min_{\gamma(\cdot;y) \in \Gamma \cap M} E\{f(\gamma(x;y),x) \mid F_0\}(y) \Big\} \tag{2.5.1}$$

Proof: For $\gamma(\cdot) \in \Gamma' \cap M$, $\gamma(\cdot;y) \in \Gamma \cap M$,

$$E\{f(\gamma(x;y),x) \mid F_0\}(y) = E\{f(\gamma(x),x) \mid F_0\}(y) \tag{2.5.2}$$

For a proof of this see Appendix A.

Thus

$$\min_{\gamma(\cdot;y) \in \Gamma \cap M} E\{f(\gamma(x;y),x) \mid F_0\}(y) = E\{f(\gamma^*(x;y),x) \mid F_0\}(y)
\le E\{f(\gamma(x),x) \mid F_0\}(y) \quad \text{a.e. for all } \gamma \in \Gamma' \cap M \tag{2.5.3}$$

Taking the unconditional expectation and minimizing over $\Gamma' \cap M$ we have

$$E\Big\{ \min_{\gamma(\cdot;y) \in \Gamma \cap M} E\{f(\gamma(x;y),x) \mid F_0\}(y) \Big\} \le \min_{\gamma \in \Gamma' \cap M} E\{f(\gamma(x),x)\} \tag{2.5.4}$$

On the other hand

$$E\Big\{ \min_{\gamma(\cdot;y) \in \Gamma \cap M} E\{f(\gamma(x;y),x) \mid F_0\}(y) \Big\} = E\{f(\gamma^*(x),x)\} \ge \min_{\gamma \in \Gamma' \cap M} E\{f(\gamma(x),x)\} \tag{2.5.5}$$

From equations (2.5.4) and (2.5.5) we obtain equation (2.5.1). Q.E.D.

Using Lemma 2.5.1, Problem 2.4 can be solved by considering the
following problem.

Problem 2.5: Minimize $E\{f(\gamma(x;y),x) \mid F_0\}(y)$ a.e.

$$\text{Subject to } E\{g(\gamma(x;y),x) \mid F_0\}(y) \le 0 \quad \text{a.e.} \tag{2.5.6}$$

$\gamma(\cdot;y) \in \Gamma_1 \times \cdots \times \Gamma_N$

If $F_0$ is such that the conditional probability measure $P_y^0(A)$ is
regular, i.e. it is a probability measure given any $y$, then Problem 2.5
can be transformed to the following form.

Problem 2.6:

Minimize $\hat{f}(\gamma;y)$

Subject to $\hat{g}(\gamma;y) \le 0$

$$\gamma(\cdot;y) \in \Gamma \tag{2.5.7}$$

where

$$\hat{f}(\gamma;y) = E\{f(\gamma(x;y),x) \mid F_0\}(y) = \int f(\gamma(x;y),x)\, dP_y^0(x) \tag{2.5.8}$$

$$\hat{g}(\gamma;y) = E\{g(\gamma(x;y),x) \mid F_0\}(y) = \int g(\gamma(x;y),x)\, dP_y^0(x) \tag{2.5.9}$$

Remark: The conditional probability measure is regular if it is generated
by an observation function [D1].

Problem 2.6 is a conventional functional minimization problem given
any $y$. The results in Theorems 2.3.1 and 2.3.2 do not depend on the finite
dimensionality of $u$, thus a hierarchical decomposition is obtained if
a saddle-point exists for Problem 2.5. This is summarized in the following
theorem.

Theorem 2.5.2: Suppose there exists a saddle-point $(\gamma^*(\cdot;y), p^*(y))$ for
the Lagrangian associated with Problem 2.5. Then Problem 2.5 can be
solved by the following hierarchical decomposition.

Lower level:

$$\min_{\gamma_i(\cdot;y) \in \Gamma_i} L_i(\gamma_i(\cdot;y), p(y), y) = E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid F_0\}(y), \qquad i = 1, \ldots, N \tag{2.5.10}$$

Higher level:

$$\max_{p(y) \ge 0} \; \sum_{i=1}^{N} L_i^*(p(y),y) - E\{p'(y)\, g_0(x) \mid F_0\}(y) \tag{2.5.11}$$

where $L_i^*(p(y),y)$ is the minimum obtained in equation (2.5.10).

Proof: By using Theorem 2.3.2 on Problem 2.6, the decomposition is
obtained.

Corresponding to Problem 2.4 we have the following decomposition.

Higher level: Choose $p^*(y)$ measurable with respect to $F_0$.

Lower level:

$$\min_{\gamma_i(\cdot;y) \in \Gamma_i} L_i(\gamma_i(\cdot;y), p^*(y), y) = E\{f_i(\gamma_i(x;y),x) + p^{*\prime}(y)\, g_i(\gamma_i(x;y),x) \mid F_0\}(y), \qquad i = 1, \ldots, N \tag{2.5.12}$$

Note the optimal $\gamma_i^*$ can be expressed in the form $\gamma_i^*(x, p^*(x))$.

The optimization problem of each lower level decision maker is
described by equation (2.5.12). A conditional expectation has to be
optimized by each. This optimization is not always well-defined with
the information available to the ith decision maker. We give a necessary
and sufficient condition when this is defined.

Theorem 2.5.3: Let $G_i$ be the smallest σ-algebra contained in $F_0$ with respect
to which $E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid F_0\}$ is measurable. Then
given $p(y)$, $L_i(\gamma_i(\cdot;y), p(y), y)$ can be optimized by the ith decision maker
if and only if $G_i \subset F_i$.

Proof: For any measurable function $\xi(x)$, if $E\{\xi \mid F_0\}$ is measurable with
respect to $G_i$, $G_i \subset F_0$, then $E\{\xi \mid F_0\} = E\{\xi \mid G_i\}$ (see Appendix A). If
$G_i \subset F_i$, then

$$E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid F_0\}
= E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid G_i\}
= E\{E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid F_i\} \mid G_i\} \tag{2.5.13}$$

The inner expectation can be evaluated by the ith agent and minimized with
respect to $\gamma_i(\cdot;y) \in \Gamma_i$, hence minimizing $L_i(\gamma_i(\cdot;y), p(y), y)$. If $G_i \not\subset F_i$,
then $E\{f_i(\gamma_i(x;y),x) + p'(y)\, g_i(\gamma_i(x;y),x) \mid G_i\}$ cannot be evaluated given
the information contained in $F_i$, and thus it cannot be minimized. Q.E.D.
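
The measurability condition of Theorem 2.5.3 can be checked mechanically on a finite state space. The sketch below is an illustration under assumptions that are not in the text: σ-algebras are represented as partitions (one cell label per state), the four-state example and all function names are hypothetical, and `xi` stands for the random variable inside the conditional expectation.

```python
import numpy as np

def cond_exp(values, prob, cells):
    """E{values | partition}, returned as one number per state."""
    out = np.empty(len(values))
    for c in np.unique(cells):
        mask = cells == c
        out[mask] = np.dot(values[mask], prob[mask]) / prob[mask].sum()
    return out

def generated_partition(values):
    """Smallest partition making `values` measurable: its level sets."""
    _, labels = np.unique(np.round(values, 12), return_inverse=True)
    return labels

def is_sub_sigma_algebra(coarse, fine):
    """True if every cell of `fine` lies inside a single cell of `coarse`."""
    return all(len(np.unique(coarse[fine == c])) == 1 for c in np.unique(fine))

prob = np.full(4, 0.25)
F0 = np.array([0, 0, 1, 2])      # coordinator's partition (hypothetical)
Fi = np.array([0, 1, 1, 1])      # ith agent's partition (hypothetical)

xi = np.array([1.0, 3.0, 2.0, 2.0])
e0 = cond_exp(xi, prob, F0)
Gi = generated_partition(e0)
well_defined = is_sub_sigma_algebra(Gi, Fi)

# Tower identity of (2.5.13): E{xi | F0} = E{ E{xi | Fi} | Gi } when Gi is inside Fi.
tower = cond_exp(cond_exp(xi, prob, Fi), prob, Gi)

xi_bad = np.array([1.0, 3.0, 2.0, 5.0])
Gi_bad = generated_partition(cond_exp(xi_bad, prob, F0))
ill_defined = not is_sub_sigma_algebra(Gi_bad, Fi)
print(well_defined, np.allclose(tower, e0), ill_defined)
```

In the first case the condition holds even though the coordinator's partition is not contained in the agent's, so the agent does not need to know more than the coordinator for the lower level problem to be well defined.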

$G_i$ represents the minimal sufficient information required by the
ith agent to solve the decomposed decision problem given only $p(y)$. If
this information is not available, then the coordinator has to supply
something else besides $p(y)$. Typically this would be $P^{G_i}(A)$, the
conditional probability measure with respect to $G_i$. Note that although
$F_0 \subset F_i$ satisfies the condition in Theorem 2.5.3, it is not always
necessary for the ith agent to have more information than the coordinator.
This will be illustrated in the next section.

6. Reformulation in Terms of Measurement Functions

In order to gain more insight, we shall reformulate the problem
in terms of probability densities and measurement functions. The
information requirements for the hierarchical decomposition can
then be seen more easily.

Let $x$ be the state of the system; $x$ includes noises as well.

$z_i = h_i(x)$ be the measurement of the ith agent; $z_i \in Z_i$

$z_0 = h_0(x)$ be the measurement of the coordinator (specifying
the constraint); $z_0 \in Z_0$

Then $F_i$, $i = 1, \ldots, N$, is the σ-field on $X$ generated by $h_i$, and $\gamma_i$ is
measurable with respect to $F_i$ if $\gamma_i = \eta_i \circ h_i$ where $\eta_i$ is Borel-measurable
on $Z_i$.

Corresponding to Problem 2.4 we have

Problem 2.7: Minimize $E\{f(\eta(z),x)\}$

Subject to $E\{g(\eta(z),x) \mid z_0\} \le 0$

$\eta(z) = (\eta_1(z_1;z_0), \ldots, \eta_N(z_N;z_0))$

$f(\eta(z),x) = f_1(\eta_1(z_1;z_0),x) + \cdots + f_N(\eta_N(z_N;z_0),x)$

$$g(\eta(z),x) = g_1(\eta_1(z_1;z_0),x) + \cdots + g_N(\eta_N(z_N;z_0),x) - g_0(x) \tag{2.6.1}$$

Corresponding to Problem 2.5, we have

Problem 2.8: Minimize $E\{f(\eta(z),x) \mid z_0\}$

$$\text{Subject to } E\{g(\eta(z),x) \mid z_0\} \le 0 \tag{2.6.2}$$

with $\eta$, $f$ and $g$ given as in equation (2.6.1).

Theorem 2.5.2 then becomes

Theorem 2.6.1: Suppose there exists a saddle-point $(\eta^*(\cdot;z_0), p^*(z_0))$
for the Lagrangian associated with Problem 2.8, then Problem 2.8 can be
solved by the following hierarchical decomposition.

Lower level:

$$\min_{\eta_i(\cdot;z_0)} L_i(\eta_i(\cdot;z_0), p(z_0), z_0) = E\{f_i(\eta_i(z_i;z_0),x) + p'(z_0)\, g_i(\eta_i(z_i;z_0),x) \mid z_0\}, \qquad i = 1, \ldots, N \tag{2.6.3}$$

Higher level:

$$\text{Maximize } \sum_{i=1}^{N} L_i^*(p(z_0),z_0) - E\{p'(z_0)\, g_0(x) \mid z_0\} \quad \text{subject to } p(z_0) \ge 0 \tag{2.6.4}$$

$L_i^*(p(z_0),z_0)$ is the minimum obtained in equation (2.6.3).

Remark: From equation (2.6.3) we conclude that $\eta_i^*(z_i;z_0) = \eta_i^*(z_i;p^*(z_0))$,
i.e., all the relevant information about the constraint is contained in
$p^*(z_0)$ if the lower level problem is well defined.

The hierarchical decomposition scheme for Problem 2.7 then consists
of the following.

Higher level: Coordinator makes a measurement $z_0$, computes the coordinating
parameter $p^*(z_0)$ and sends it to the lower level.

Lower level: ith decision agent makes a measurement $z_i$, and uses this
together with $p^*(z_0)$ to compute the best decision function $\eta_i^*(z_i;p^*(z_0))$.

The structure of the decomposition is displayed in Figure 2.2. Note
that the decomposition is in real-time since no iterations are involved.
Because of the static nature of the problem, the information flow between the
coordinator and lower level decision makers is only one-way.

Fig. 2.2 Structure of Decomposition (Stochastic)

An alternative condition for Theorem 2.5.3 is the following.

Theorem 2.6.2: $L_i(\eta_i(\cdot;z_0), p(z_0), z_0)$ can be optimized by the ith decision
maker if and only if

$$E\{f_i(\eta_i(z_i;z_0),x) + p'(z_0)\, g_i(\eta_i(z_i;z_0),x) \mid z_i, z_0\}
= E\{f_i(\eta_i(z_i;z_0),x) + p'(z_0)\, g_i(\eta_i(z_i;z_0),x) \mid z_i\} \tag{2.6.5}$$

Proof: By the nested property of the conditional expectation [L4],

$$L_i(\eta_i(\cdot;z_0), p(z_0), z_0)
= E\{E\{f_i(\eta_i(z_i;z_0),x) + p'(z_0)\, g_i(\eta_i(z_i;z_0),x) \mid z_i, z_0\} \mid z_0\} \tag{2.6.6}$$

If the inner conditional expectation is equal to the right side of
equation (2.6.5), then it can be minimized with respect to $\eta_i(\cdot;z_0)$.
If equation (2.6.5) does not hold, then $L_i(\eta_i(\cdot;z_0), p(z_0), z_0)$ depends on
the specific value of $z_0$ and thus cannot be minimized with respect to the
function $\eta_i(\cdot;p(z_0))$. Q.E.D.

We now give the results relating to the information between $z_0$ and $z_i$.

(1) $z_0 \subset z_i$ (Coordinator has less information than ith decision agent)
Then condition (2.6.5) is automatically satisfied.

Thus $$u_i^* = \eta_i^*(z_i;p^*(z_0)) \tag{2.6.7}$$

(2) $z_0 \not\subset z_i$ (Coordinator has some information not available to ith
decision agent)

(a) Condition (2.6.5) is satisfied.

$$u_i^* = \eta_i^*(z_i;p^*(z_0))$$

Examples:

(i) $$f_i(x) = f_i(x_i), \qquad g_i(x) = g_i(x_i) \tag{2.6.8}$$

$$z_i = h_i(x_i), \qquad z_0 = h_0([x_i]) \tag{2.6.9}$$

where $x_i$ and $[x_i]$ are statistically independent.

(ii) $$f_i(x) = f_i(x_i), \qquad g_i(x) = g_i(x_i) \tag{2.6.10}$$

$$z_i = h_i(x_i), \qquad z_0 = \begin{pmatrix} h_{01}(x_i) \\ h_{02}([x_i]) \end{pmatrix} \tag{2.6.11}$$

$$h_{01}(x_i) \subset z_i \tag{2.6.12}$$

(b) Condition (2.6.5) is violated.

$$u_i^* = \eta_i^*(z_i;z_0) = \eta_i^*(z_i;P(x \mid z_0)) \tag{2.6.13}$$

where $P(x \mid z_0)$ is the conditional probability density of $x$
given $z_0$. In this case $z_i$ and $p^*(z_0)$ are no longer sufficient
statistics for the ith decision maker.


In words, if the coordinator has less information than the ith
decision agent, as in the case when the information of the coordinator is
shared by all decision agents, then the lower level problem is well defined
given $p(z_0)$ and the information of the ith decision agent. When this is
not true, then the structure of the system and the information pattern have
to be compatible in a certain sense, e.g. the state of the ith subsystem
is statistically independent from the rest of the system and the coordinator
observes that state but this information is available to the ith
decision agent.

Under other circumstances, the optimization problem for the ith
decision agent may not be well-defined without the knowledge of $z_0$.

7. Solution of the Example

Using the results derived in the previous sections, the resource
manager would charge an optimal price $p^*(z_0)$ for the resources. Each
division manager would then solve the following problem.

$$\min_{\eta_i(\cdot;z_0)} E\{\eta_i'(z_i;z_0) R_i \eta_i(z_i;z_0) - 2\eta_i'(z_i;z_0)\pi_i + p^{*\prime}(z_0) A_i \eta_i(z_i;z_0) \mid z_0\} \tag{2.7.1}$$

Since $z_i$ is statistically independent of $v$ and $\theta_0$, the conditional
expectation is equal to the unconditional expectation given $p^*(z_0)$. In fact
the optimal $\eta_i^*(\cdot;z_0)$ is given by

$$\eta_i^*(z_i;z_0) = R_i^{-1}\big(E\{\pi_i \mid z_i\} - \tfrac{1}{2} A_i' p^*(z_0)\big) \tag{2.7.2}$$

The higher level problem is

$$\max_{p(z_0) \ge 0} \; \sum_{i=1}^{N} E\{\eta_i^{*\prime}(z_i;z_0) R_i \eta_i^*(z_i;z_0) - 2\eta_i^{*\prime}(z_i;z_0)\pi_i + p'(z_0) A_i \eta_i^*(z_i;z_0) \mid z_0\} - E\{p'(z_0)\, v \mid z_0\} \tag{2.7.3}$$

$$E\{\eta_i^{*\prime}(z_i;z_0) R_i \eta_i^*(z_i;z_0) - 2\eta_i^{*\prime}(z_i;z_0)\pi_i + p'(z_0) A_i \eta_i^*(z_i;z_0) \mid z_0\}
= E\{-\eta_i^{*\prime}(z_i;z_0) R_i \eta_i^*(z_i;z_0) \mid z_0\}$$
$$= -E\{(E\{\pi_i \mid z_i\} - \tfrac{1}{2} A_i' p(z_0))' R_i^{-1} (E\{\pi_i \mid z_i\} - \tfrac{1}{2} A_i' p(z_0)) \mid z_0\}
= -\tfrac{1}{4}\, p'(z_0) A_i R_i^{-1} A_i' p(z_0) + p'(z_0) A_i R_i^{-1} E\{\pi_i\} - c_i \tag{2.7.4}$$

$$c_i = E\{E\{\pi_i \mid z_i\}' R_i^{-1} E\{\pi_i \mid z_i\}\} = \text{constant} \tag{2.7.5}$$

Thus

$$p^*(z_0) = \arg\max_{p(z_0) \ge 0} \Big\{ -\tfrac{1}{4}\, p'(z_0) \Big(\sum_{i=1}^{N} A_i R_i^{-1} A_i'\Big) p(z_0) + p'(z_0) \Big(\sum_{i=1}^{N} A_i R_i^{-1} E\{\pi_i\} - E\{v \mid z_0\}\Big) - \sum_{i=1}^{N} c_i \Big\} \tag{2.7.6}$$
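
For a single scalar resource the maximization in (2.7.6) has a closed form, which gives a quick numerical check of the certainty-equivalence structure. The sketch below is an illustration only: it assumes one scalar resource, symmetric positive definite $R_i$, a known prior mean of each $\pi_i$, and that the coordinator's estimate of $v$ given $z_0$ is supplied as a number. The names `a_rows`, `pi_means`, `v_hat` and the data are hypothetical.

```python
import numpy as np

def optimal_price(Rs, a_rows, pi_means, v_hat):
    """Maximize -(1/4) Q p^2 + b p over p >= 0 (single scalar resource)."""
    Q = sum(a @ np.linalg.solve(R, a) for R, a in zip(Rs, a_rows))
    b = sum(a @ np.linalg.solve(R, m)
            for R, a, m in zip(Rs, a_rows, pi_means)) - v_hat
    return max(0.0, 2.0 * b / Q)

def division_decision(R, a, pi_hat, p):
    """Decision using the local estimate pi_hat of pi_i and the price p."""
    return np.linalg.solve(R, pi_hat - 0.5 * a * p)

# Hypothetical data: two divisions, one resource.
Rs = [np.array([[1.0]]), np.array([[2.0]])]
a_rows = [np.array([1.0]), np.array([1.0])]
pi_means = [np.array([2.0]), np.array([4.0])]
v_hat = 1.0                                   # coordinator's estimate of v

p = optimal_price(Rs, a_rows, pi_means, v_hat)

# Expected resource use: each division's estimate averages to the prior mean.
expected_use = sum(a @ division_decision(R, a, m, p)
                   for R, a, m in zip(Rs, a_rows, pi_means))
# A particular realization: division 1 estimates its price as 2.5.
u1 = division_decision(Rs[0], a_rows[0], np.array([2.5]), p)
print(p, expected_use, u1)
```

With a positive price the expected resource use equals the coordinator's estimate of $v$, so the constraint holds on the average given $z_0$, while individual realizations may use more or less.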

i=l i=l 

Comparing with the deterministic case in Section 3 we see that some
kind of certainty equivalence (separation) theorem holds. The lower
level division managers choose their optimal productions by replacing
the actual prices of their products with the best estimates given their
measurements. However, whereas in the deterministic case the resource
manager needs both $\pi_i$, $i = 1, \ldots, N$, and $v$ to arrive at the optimal decision,
resulting in essentially no decentralization in information, now it is
only necessary to have information on $v$.

8. Discussion and Perspectives

The decomposition achieved in mathematical programming for a class 
of systems with the general structure described in Section 2.3 is really 
with respect to computation. To study a possible decentralization in 
information we have formulated the stochastic version. It is found that 
under certain conditions a hierarchical decomposition for the problem is 
possible. The lower level decision makers need only to get certain signals 
from the higher level coordinator in addition to their information on the 
system. When these conditions are not satisfied, then in general the
signals are not sufficient.

Radner and Groves [R2,G1] have considered a resource allocation 
problem similar to the one mentioned here. However, in their treatment 
there exists a resource manager who is in charge of allocating the resources 
directly. In our formulation, the resource manager serves only as a coordinator.
In the deterministic case, these two formulations become the same
since the lack of an information pattern reduces the problem to the case 
of a single decision maker. 



CHAPTER 3 


DECOMPOSITION FOR NONLINEAR STOCHASTIC DYNAMIC SYSTEMS (OFF-LINE) 

1. Introduction 

In this chapter we consider the stochastic control of N coupled 
nonlinear subsystems. Each system has a controller who has noisy 
measurements on his subsystem. There is no communication between the 
controllers. The overall objective of the system consisting of all 
subsystems is the sum of individual objectives of the subsystems. 

Because of the dynamic nature of the problem, the difficulties 
encountered here are different from those in static systems. Gener- 
ally speaking, since the controls have to be applied in real time, 
on-line computation requirements for implementation of the optimal 
control strategy become important. The class of problems with different
information patterns for the different controllers has been studied
under the topic of dynamic teams [A1, C3, C4, H3]. So far, the results
have not been very satisfactory in several respects. First, the optimal 
solution for even a linear-quadratic-Gaussian team is not known yet 
although there are indications of what the optimal solution should look 
like. Second, although the information structure in team decision 
problems is decentralized, often this is accompanied by an increase in 
both on-line and off-line computation. To give an example, let us 
consider the linear-quadratic-Gaussian problem. If information is 
centralized, then the optimal control strategy is given by the "separation" 
theorem and consists of the optimal deterministic control law acting on 
the estimate generated by the Kalman-Bucy filter [A2, M2, T1]. The
on-line computation can be replaced by building a finite-dimensional filter.
However, if information is decentralized, then the on-line computation
is extremely involved since each controller has to remember all his past
observations or an "infinite-dimensional filter" is required. For a
discussion of this, see Willman [W1]. As for the off-line computation,
little is known since the optimal solution is not available. However, 
the computation involved in finding a suboptimal solution to the 
dynamic team problem has been shown to be relatively complicated [C3] . 

Since the computation and implementation of a control strategy is 
as important as the optimality resulting from the strategy itself, we 
will formulate in this chapter an optimization problem which is computationally
more feasible as well as informationally efficient. The special
coupled structure of the system and the form of the cost functional will 
be exploited. The concept of information structure is extended to include 
a priori information as well as a posteriori information. Thus the local 
controllers will not only have measurements on their subsystems alone, 
but will also be ignorant about the structure of the other subsystems. 

The coupled nature of the subsystems is taken care of by a coordinator who 
sees that certain constraints are satisfied. In this chapter we study the 
case when the coordinator has only a priori information, i.e. he does 
not make any measurements. In Chapter 5, we investigate the case when the 
coordinator makes on-line measurements. 

The dynamic team problem is stated in the next section. A decom- 
position for the deterministic problem is then stated. This will be used 
to motivate the formulation of the stochastic decomposition problem in 
Section 3. In Section 4 we formulate a constrained stochastic optimal
control problem as a mathematical programming problem. In Section 5,
the problem formulated in Section 3 is decomposed.

2. Statement of the Problem

We consider a discrete-time system consisting of N subsystems
coupled together.

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k), \xi_i(k)), \qquad i = 1, \ldots, N \tag{3.2.1}$$

$$v_i(k) = \sum_{j \ne i} g_{ij}(x_j(k)) \tag{3.2.2}$$

where

$x_i(k) \in R^{n_i}$ is the "state" of the ith subsystem.

$v_i(k)$, a real vector, is the action on the ith subsystem due to the
other N-1 subsystems.

$u_i(k) \in R^{p_i}$ is the control on the ith subsystem.

$\xi_i(k) \in R^{r_i}$ is the driving noise on the ith subsystem.

$f_i$ is the state transition function.

Let

$$x(k) = \begin{pmatrix} x_1(k) \\ \vdots \\ x_N(k) \end{pmatrix}, \qquad
u(k) = \begin{pmatrix} u_1(k) \\ \vdots \\ u_N(k) \end{pmatrix}, \qquad
\xi(k) = \begin{pmatrix} \xi_1(k) \\ \vdots \\ \xi_N(k) \end{pmatrix} \tag{3.2.3}$$

Then $v_i(k)$, $i = 1, \ldots, N$, can be eliminated from equation (3.2.1) to
obtain a description for the whole system as

$$x(k+1) = f(x(k), u(k), \xi(k)) \tag{3.2.4}$$

where the function $f$ is defined in an obvious manner.

The description in terms of equations (3.2.1) and (3.2.2) is
preferred here to display the coupled nature of the system. Note that
even though $x(k)$ can be regarded as the state of the system if the driving
noise is absent, $x_i(k)$ is, strictly speaking, not a state for the ith
subsystem since knowledge of $x_i(k)$, together with all the controls $u_i(j)$,
$j \ge k$, is not sufficient to determine the future behavior of the ith
subsystem.
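
The equivalence of the subsystem description (3.2.1), (3.2.2) and the composite description (3.2.4) can be checked numerically. The sketch below assumes a special case that is not in the text: scalar subsystems with linear dynamics $x_i(k+1) = a_i x_i(k) + v_i(k) + u_i(k) + \xi_i(k)$ and linear interconnections $g_{ij}(x_j) = G_{ij} x_j$. All names and numbers are hypothetical.

```python
import numpy as np

def step_subsystems(x, u, xi, a, G):
    """One step of (3.2.1)-(3.2.2), subsystem by subsystem."""
    N = len(x)
    x_next = np.empty(N)
    for i in range(N):
        # Interaction input from the other N-1 subsystems, as in (3.2.2).
        v_i = sum(G[i, j] * x[j] for j in range(N) if j != i)
        x_next[i] = a[i] * x[i] + v_i + u[i] + xi[i]
    return x_next

def step_composite(x, u, xi, a, G):
    """The same step written as (3.2.4) after eliminating v_i."""
    F = np.diag(a) + G - np.diag(np.diag(G))   # off-diagonal coupling only
    return F @ x + u + xi

rng = np.random.default_rng(0)
a = np.array([0.9, 0.5, -0.3])
G = rng.normal(size=(3, 3))
x0, u0, xi0 = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)

x_sub = step_subsystems(x0, u0, xi0, a, G)
x_all = step_composite(x0, u0, xi0, a, G)
print(np.allclose(x_sub, x_all))
```

The subsystem form makes the point of the paragraph visible: advancing $x_i$ requires $v_i(k)$, which depends on the other subsystems' states, so $x_i(k)$ and $u_i$ alone do not determine the future of the ith subsystem.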

The cost functional for the whole system is a sum of cost functionals
for the individual subsystems, i.e.,

$$J = \sum_{i=1}^{N} J_i \tag{3.2.5}$$

$$J_i = E\Big\{ K_i(x_i(T)) + \sum_{k=0}^{T-1} L_i(x_i(k), u_i(k)) \Big\} \tag{3.2.6}$$

It is required to minimize J. The expectation is taken with respect to
all the primitive random variables.

The problem is not yet well defined because we have not specified
the information pattern of the system.


Let

$$y_i(k) = h_i(x_i(k), \theta_i(k)), \qquad i = 1, \ldots, N \tag{3.2.7}$$

$y_i(k) \in R^{m_i}$ is the measurement on the ith subsystem by the
ith controller.

$\theta_i(k)$, a real vector, is the noise corrupting the measurement.

Let

$$Y(k) = \{y_i(s);\ 0 \le s \le k,\ i = 1, \ldots, N\} \tag{3.2.8}$$

$$U(k) = \{u_i(s);\ 0 \le s \le k,\ i = 1, \ldots, N\}, \qquad k = 0, \ldots, T-1 \tag{3.2.9}$$

Let $(Y_i(k), U_i(k-1), I_i)$ be the information available to the ith controller
at time k.

$$Y_i(k) \subset Y(k); \qquad U_i(k-1) \subset U(k-1) \tag{3.2.10}$$

$I_i$ is the a priori information of the entire system available to the ith
controller.

Then $u_i(k)$ is required to be a measurable function of $Y_i(k)$ and $U_i(k-1)$
which can be generated from $I_i$, i.e.,

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I_i) \tag{3.2.11}$$

$I_i$ is introduced to take into consideration structural information
of the system. The information available to the ith controller thus consists
of two kinds: a priori (structural) information of the system and
a posteriori (measurement) information. $I_i$ essentially specifies the
complexity of the control strategy. In the system given, if $I_i$ contains
only the description of the ith subsystem, then as far as each controller
is concerned, he is controlling an uncoupled system with an unknown input
$v_i(k)$. His control law $\gamma_i^k$ would thus depend only on the parameters
of his subsystem. This control law is thus "simpler", although a "loss"
in mathematical optimality results.

In most of the work done thus far [H3, C4, C3], decentralization refers
mainly to measurements, i.e., a posteriori information. The structure of
the whole system is assumed known to each controller. With this a priori
information, decentralized a posteriori information almost inevitably gives
rise to a more complicated control strategy than centralized a posteriori
information because each controller tries to generate the missing measurements
using the common a priori information. The amount of on-line computation
involved always increases, as well as the amount of off-line
computation. Even when the on-line computation is constrained by choosing
suboptimal control structures, as in Chong and Athans [C3], the off-line
computation required is still tremendous. In the implementation of control
laws, computation considerations are as important as information
considerations. This leads us to consider decentralized a priori information,
at a sacrifice in overall optimality.

There is some work in the control literature which is vaguely
related to decentralized a priori information. This is found in Refs. [M1]
and [P1] and can be essentially illustrated by the following theorem for
deterministic systems.

Theorem 3.2.1: Consider the optimal control problem given by

System:

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k)), \qquad i = 1, \ldots, N \tag{3.2.12}$$

$$v_i(k) = \sum_{j \ne i} g_{ij}(x_j(k)), \qquad x(0) \text{ given}$$

Cost functional:

$$J = \sum_{i=1}^{N} J_i, \qquad J_i = K_i(x_i(T)) + \sum_{k=0}^{T-1} L_i(x_i(k), u_i(k)) \tag{3.2.13}$$

Suppose there exists a constrained saddle-point $(x^*, u^*, v^*, p^*)$ to the
problem

$$L(x^*, u^*, v^*, p) \le L(x^*, u^*, v^*, p^*) \le L(x, u, v, p^*) \tag{3.2.14}$$

where

$x = \{x_i(k);\ k = 1, \ldots, T;\ i = 1, \ldots, N\}$

$u = \{u_i(k);\ k = 0, \ldots, T-1;\ i = 1, \ldots, N\}$

$v = \{v_i(k);\ k = 0, \ldots, T-1;\ i = 1, \ldots, N\}$

$p = \{p_i(k);\ k = 0, \ldots, T-1;\ i = 1, \ldots, N\}$

$x$, $u$, $v$ are constrained by equation (3.2.12), and

$$L(x, u, v, p) = J + \sum_{i=1}^{N} \sum_{k=0}^{T-1} p_i'(k) \Big( v_i(k) - \sum_{j \ne i} g_{ij}(x_j(k)) \Big) \tag{3.2.15}$$

Then the optimal control problem can be solved as a two-level problem.

Lower Level: $\min_{u_i, v_i} J_i(u_i, v_i, p)$

$$J_i(u_i, v_i, p) = K_i(x_i(T)) + \sum_{k=0}^{T-1} \Big[ L_i(x_i(k), u_i(k)) + p_i'(k)\, v_i(k) - \sum_{j \ne i} p_j'(k)\, g_{ji}(x_i(k)) \Big] \tag{3.2.16}$$

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k)) \tag{3.2.17}$$

Higher Level:

$$\max_{p} \sum_{i=1}^{N} J_i^*(p) \tag{3.2.18}$$

where $J_i^*(p)$ is the minimum obtained in equation (3.2.16).

Proof: The results in Section 2.3 are used. $L(x, u, v, p)$ splits up into
uncoupled $J_i$'s by collecting all the terms involving $x_i$, $v_i$ and $u_i$.
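
The regrouping used in the proof can be written out explicitly. The following is a sketch in the notation of (3.2.13) and (3.2.15): the first line exchanges the roles of the indices $i$ and $j$ in the double sum over the interconnection terms, and the second line collects, for each $i$, every term that involves $x_i$, $u_i$ and $v_i$.

```latex
\begin{align*}
\sum_{i=1}^{N} \sum_{j \neq i} p_i'(k)\, g_{ij}(x_j(k))
  &= \sum_{i=1}^{N} \sum_{j \neq i} p_j'(k)\, g_{ji}(x_i(k)), \\
L(x,u,v,p)
  &= \sum_{i=1}^{N} \Big[ K_i(x_i(T)) + \sum_{k=0}^{T-1}
     \Big( L_i(x_i(k),u_i(k)) + p_i'(k)\, v_i(k)
       - \sum_{j \neq i} p_j'(k)\, g_{ji}(x_i(k)) \Big) \Big].
\end{align*}
```

Each bracketed term depends only on the trajectory of the ith subsystem and on the multipliers, which is why the minimization over $(x, u, v)$ separates into N independent lower level problems once $p$ is fixed.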

If the optimal $p^*$ is given, then the lower level control problems
are all uncoupled. The optimal control $u_i^*$ can be found using only
the structure of the ith system (its system dynamics and cost functional)
plus the interconnection functions $g_{ji}(\cdot)$, $j \ne i$. The structural information
of the other subsystems is not required. On the other hand, $p^*$ is determined
using all the optimal $J_i^*(p)$'s. Although algorithms can be
devised making use of the special two-level structure of the optimization
problem, the convergence to the optimal solution is not accomplished in real
time [P1]. Thus the decomposition achieved is really with respect to the
off-line computation. In the deterministic problem given above, this
corresponds to finding the open-loop control functions in some decentralized
manner. In the next section we shall show that this philosophy can be
extended to the stochastic case.

3. Formulation of the Stochastic Decomposition Problem

In the deterministic case given above, $v_i$ is the action of the
other subsystems on the ith subsystem, a quantity which is needed for
the optimal control of the ith subsystem but is not itself optimized.
However, if the constraint $v_i(k) = \sum_{j \ne i} g_{ij}(x_j(k))$ is satisfied
exactly, optimizing with respect to $u_i$ and $v_i$ simultaneously is equivalent
to solving the original optimal control problem with $u_i$ as the only
control to be optimized. In the actual implementation of the control,
only $u_i$ is used.

For each lower level problem, $v_i$ can be regarded as an estimate
of the interaction given $p$. If the optimal $p^*$ is used, then $v_i$ is equal
to the action of the other subsystems exactly.

We now extend this philosophy to the stochastic case. Instead
of solving for the problem described by equations (3.2.1), (3.2.2),
(3.2.5), (3.2.6) and (3.2.11) we shall exploit the coupled nature of
the system. Since $x(k)$ given the control strategy is a random vector,
it is no longer possible to choose $v_i(k)$ such that it equals $\sum_{j \ne i} g_{ij}(x_j(k))$
exactly. Rather $v_i(k)$ is only required to be an estimate of the interaction
and this is the job of the coordinator. We thus have the following formulation.
Problem 3.1:

Given

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k), \xi_i(k)), \qquad i = 1, \ldots, N \tag{3.3.1}$$

$$J = \sum_{i=1}^{N} J_i, \qquad J_i = E\Big\{ K_i(x_i(T)) + \sum_{k=0}^{T-1} L_i(x_i(k), u_i(k)) \Big\} \tag{3.3.2}$$

$$v_i(k) = E\Big\{ \sum_{j \ne i} g_{ij}(x_j(k)) \Big\}, \qquad i = 1, \ldots, N, \quad k = 0, \ldots, T-1 \tag{3.3.3}$$

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I) \tag{3.3.4}$$

$$v_i(k) = D_i^k(I) \tag{3.3.5}$$

$$Y_i(k) = \{y_i(s);\ 0 \le s \le k\} \tag{3.3.6}$$

$$U_i(k) = \{u_i(s);\ 0 \le s \le k\} \tag{3.3.7}$$

Find $\gamma_i^k$ and $D_i^k$, $i = 1, \ldots, N$; $k = 0, \ldots, T-1$, such that J is minimized. $I$
consists of the a priori information contained in the model and the cost
functional.

The original stochastic control problem has been modified in the
following manner. The subsystems are all assumed to be uncoupled. The
interaction of the other subsystems is represented by $v_i(k)$ which is to
be optimized. $v_i(k)$ is chosen, however, so that constraint (3.3.3)
is satisfied; thus it is an unbiased a priori estimate of the interaction
of the other subsystems. The control problem then consists of finding
the optimal control strategies and the optimal estimates of the interactions
such that the cost functional is minimized.

Although this problem is very similar to the deterministic problem
given in Section 2 of this chapter, the results of decomposition in mathematical
programming cannot be applied directly since closed loop control
strategies are required. In the next section, we show how the
stochastic control problem can be reformulated so as to lead to a
constrained optimization problem.

4. A Constrained Stochastic Optimal Control Problem

Consider the following stochastic control problem. 

Problem 3.2:

System:
$$x(k+1) = f(x(k), u(k), \xi(k)), \qquad x(k) \in R^n \tag{3.4.1}$$

Measurement:
$$y(k) = h(x(k), \theta(k)), \qquad u(k) \in R^p \tag{3.4.2}$$

Cost functional:
$$J = E\Big\{K(x(T)) + \sum_{k=0}^{T-1} L(x(k), u(k))\Big\} \tag{3.4.3}$$

$\xi(k)$, $\theta(k)$, $k = 0,\ldots,T-1$ and $x(0)$ are random vectors with known statistics.

$$Y(k) \subset \{y(0),\ldots,y(k);\ u(0),\ldots,u(k-1)\} \tag{3.4.4}$$

$u(k)$ is constrained to be an admissible function of $Y(k)$, i.e.,

$$u(k) = \gamma^k(Y(k)), \qquad \gamma \in \Gamma \tag{3.4.5}$$

It is required to choose $\gamma^* \in \Gamma$ such that

$$J(\gamma^*) = \min_{\gamma \in \Gamma} J(\gamma) \tag{3.4.6}$$

In the problem stated above, the minimization is only over the strategy space $\Gamma$. We can transform this to a minimization over random sequences subject to certain constraints.

Let the underlying probability space be $(\Omega, \mathcal{B}, \mu)$. $\theta(k)$, $\xi(k)$, $x(0)$ are random vectors over $\Omega$.

Let $x(\omega) = (x(1,\omega),\ldots,x(T,\omega))$ be a $\mathcal{B}$-measurable $L_2$ function over $\Omega$ into $R^{nT}$, i.e., $x \in L_2(\Omega, R^{nT})$.

Let $u(\omega) = (u(0,\omega),\ldots,u(T-1,\omega)) \in L_2(\Omega, R^{pT})$.

Let

$$S_1 = \{x \in L_2(\Omega, R^{nT}),\ u \in L_2(\Omega, R^{pT}) \mid x(k+1,\omega) = f(x(k,\omega), u(k,\omega), \xi(k,\omega)) \ \text{a.e.}\} \tag{3.4.7}$$

= set of $x, u$ which correspond to the given dynamic system,

$$S_2 = \{x \in L_2(\Omega, R^{nT}),\ u \in L_2(\Omega, R^{pT}) \mid \exists\, \gamma \in \Gamma \text{ such that } u(k,\omega) = \gamma^k(Y(k,\omega)) \ \text{a.e.}\}$$

$$= \{x \in L_2(\Omega, R^{nT}),\ u \in L_2(\Omega, R^{pT}) \mid \exists\, \gamma \in \Gamma \text{ such that } u(k,\omega) = \gamma^k(h(x(0,\omega), \theta(0,\omega)), \ldots, h(x(k,\omega), \theta(k,\omega));\ u(0,\omega),\ldots,u(k-1,\omega)) \ \text{a.e.}\} \tag{3.4.8}$$

= set of $x, u$ which can be generated from the given information structure and admissible control strategy.

Let $G : \Gamma \to L_2(\Omega, R^{nT}) \times L_2(\Omega, R^{pT})$ be defined as

$$G(\gamma) = (x(\gamma), u(\gamma)) \tag{3.4.9}$$

Then by the definition of $S_1$ and $S_2$,

$$\text{Range } G = S_1 \cap S_2 \tag{3.4.10}$$

Therefore

$$\min_{\gamma \in \Gamma} J(\gamma) = \min_{\gamma \in \Gamma} J(x(\gamma), u(\gamma)) = \min_{G(\gamma) \in S_1 \cap S_2} J(x(\gamma), u(\gamma)) = \min_{(x,u) \in S_1 \cap S_2} J(x, u) \tag{3.4.11}$$

Note that the minimization is now over the random sequences $x, u$. The dynamics of the system, the constraint on the control strategy and the information structure allowed have been incorporated into the constraint set $S_1 \cap S_2$.

We next consider the constrained stochastic control problem.

Problem 3.3:

System:
$$x(k+1) = f(x(k), u(k), \xi(k)) \tag{3.4.12}$$

Measurement:
$$y(k) = h(x(k), \theta(k)) \tag{3.4.13}$$

Cost functional:
$$J = E\Big\{K(x(T)) + \sum_{k=0}^{T-1} L(x(k), u(k))\Big\} \tag{3.4.14}$$

$$u(k) = \gamma^k(Y(k)) \tag{3.4.15}$$

$$E\{H(x(k), u(k))\} = 0 \in R^q, \qquad k = 0,\ldots,T-1 \tag{3.4.16}$$

It is required to choose $\gamma^* \in \Gamma$ such that

$$J(\gamma^*) = \min_{\gamma \in \Gamma} J(\gamma)$$

and the constraint (3.4.16) is satisfied. $H$ is a vector-valued function. This constraint is only required to be satisfied on the average.

Problem 3.3 can be transformed into the following unconstrained stochastic control problem.

Problem 3.4:

System:
$$x(k+1) = f(x(k), u(k), \xi(k)) \tag{3.4.17}$$

Measurement:
$$y(k) = h(x(k), \theta(k)) \tag{3.4.18}$$

Cost functional:
$$J(\gamma, p) = E\Big\{K(x(T)) + \sum_{k=0}^{T-1} \big[L(x(k), u(k)) + p'(k) H(x(k), u(k))\big]\Big\}$$

It is required to find $\gamma^*$ such that $J(\gamma, p)$ is minimized.

Theorem 3.4.1: Suppose a saddle point exists for the stochastic control problem 3.4, i.e., there exist $\gamma^*, p^*$ such that

$$J(\gamma^*, p) \le J(\gamma^*, p^*) \le J(\gamma, p^*) \tag{3.4.21}$$

Then $\gamma^*$ is the solution to Problem 3.3.

Proof: The constraint (3.4.16) can be written as

$$\mathbf{H}(x, u) = 0 \in R^{qT} \tag{3.4.22}$$

where $x \in L_2(\Omega, R^{nT})$, $u \in L_2(\Omega, R^{pT})$, and $\mathbf{H}(x,u)$ stacks the expectations $E\{H(x(k), u(k))\}$, $k = 0,\ldots,T-1$.

Problem 3.3 is then equivalent to

$$\min_{(x,u) \in S_1 \cap S_2} J(x, u) \quad \text{such that} \quad \mathbf{H}(x, u) = 0 \tag{3.4.23}$$

$$J(\gamma, p) = E\Big\{K(x(T)) + \sum_{k=0}^{T-1} \big[L(x(k), u(k)) + p'(k) H(x(k), u(k))\big]\Big\}$$

$$= E\Big\{K(x(T)) + \sum_{k=0}^{T-1} L(x(k), u(k))\Big\} + \sum_{k=0}^{T-1} p'(k)\, E\{H(x(k), u(k))\}$$

$$= J(\gamma) + p'\, \mathbf{H}(x(\gamma), u(\gamma)) \tag{3.4.24}$$

If $J(\gamma, p)$ has a saddle point $(\gamma^*, p^*)$, then $(G(\gamma^*), p^*)$ is a saddle point for the function $J(x, u) + p' \mathbf{H}(x, u)$.

By Theorem 2.3.1, $G(\gamma^*) = (x^*, u^*)$ solves

$$\min_{(x,u) \in S_1 \cap S_2} J(x, u) \quad \text{such that} \quad \mathbf{H}(x, u) = 0 \tag{3.4.25}$$

or $\gamma^*$ solves Problem 3.3. Q.E.D.

The following corollary follows immediately.


Corollary 3.4.2: If a saddle point $(\gamma^*, p^*)$ exists for Problem 3.4, then the optimal strategy $\gamma^*$ can be found by

$$\max_{p} \min_{\gamma} E\Big\{K(x(T)) + \sum_{k=0}^{T-1} \big[L(x(k), u(k)) + p'(k) H(x(k), u(k))\big]\Big\} \tag{3.4.26}$$

Proof: We need only the fact that if a saddle point $(\gamma^*, p^*)$ exists for the function $L(\gamma, p)$, then

$$\min_{\gamma} \max_{p} L(\gamma, p) = \max_{p} \min_{\gamma} L(\gamma, p) = L(\gamma^*, p^*) \tag{3.4.27}$$

To check for the saddle point, we need to verify the condition directly or use condition (3.4.27). The following condition is sometimes more convenient.

Lemma 3.4.3: Consider the problem

$$\min_x f(x) \quad \text{such that} \quad g(x) = 0,\ x \in C$$

If

(1) $$\max_{p} \min_{x \in C} f(x) + p' g(x) \triangleq f(x^*) + p^{*\prime} g(x^*) \ \text{exists} \tag{3.4.28}$$

(2) $g(x^*) = 0$

then $x^*$ minimizes $f(x)$ such that $g(x) = 0$, $x \in C$.

Proof:

$$f(x^*) + p' g(x^*) = f(x^*) = f(x^*) + p^{*\prime} g(x^*) \tag{3.4.29}$$

$$f(x) + p^{*\prime} g(x) \ge \min_{x \in C} f(x) + p^{*\prime} g(x) \tag{3.4.30}$$

$$\min_{x \in C} f(x) + p^{*\prime} g(x) = f(x^*(p^*)) + p^{*\prime} g(x^*(p^*)) \tag{3.4.31}$$

where $x^*(p)$ minimizes $f(x) + p' g(x)$, $x \in C$.

Thus

$$\max_{p} \min_{x \in C} f(x) + p' g(x) = \max_{p} f(x^*(p)) + p' g(x^*(p))$$

$$= f(x^*(p^*)) + p^{*\prime} g(x^*(p^*)) \quad \text{by definition}$$

$$= \min_{x \in C} f(x) + p^{*\prime} g(x) \tag{3.4.32}$$

Then

$$f(x^*) + p' g(x^*) \le f(x^*) + p^{*\prime} g(x^*) \le f(x) + p^{*\prime} g(x) \quad \text{for all } x \in C \text{ and } p \tag{3.4.33}$$

$(x^*, p^*)$ is a saddle point and $x^*$ minimizes $f(x)$ such that $g(x) = 0$, $x \in C$. Q.E.D.
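As a small illustration of Lemma 3.4.3 (an example constructed here for concreteness, not taken from the text), take $f(x) = x_1^2 + x_2^2$, $g(x) = x_1 + x_2 - 1$ and $C = R^2$. The inner minimization and the outer maximization can be carried out by hand:

$$\min_{x} \; x_1^2 + x_2^2 + p\,(x_1 + x_2 - 1) \;\Rightarrow\; x_1^*(p) = x_2^*(p) = -\tfrac{p}{2}, \qquad \phi(p) = -\tfrac{p^2}{2} - p$$

$$\max_{p} \phi(p) \;\Rightarrow\; p^* = -1, \qquad x^* = (\tfrac12, \tfrac12), \qquad g(x^*) = 0, \qquad f(x^*) = \phi(p^*) = \tfrac12$$

Both conditions of the lemma hold, and $x^* = (\tfrac12, \tfrac12)$ is indeed the minimizer of $f$ on the line $x_1 + x_2 = 1$.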
Theorem 3.4.1 can then be restated in the following form. 

Theorem 3.4.4: Suppose

$$\max_{p} \min_{\gamma} E\Big\{K(x(T)) + \sum_{k=0}^{T-1} \big[L(x(k), u(k)) + p'(k) H(x(k), u(k))\big]\Big\}$$

exists for the system described in Problem 3.4, and further

$$E\{H(x^*(k), u^*(k))\} = 0, \qquad k = 0,\ldots,T-1$$

where $x^*, u^*$ are the optimal trajectory and control using $\gamma^*$. Then $\gamma^*$ is the optimal strategy for Problem 3.3.



5. Decomposition of the Stochastic Control Problem 


We now apply the results of the last section to Problem 3.1 and transform it to an unconstrained problem.

Theorem 3.5.1: Consider the system

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k), \xi_i(k)) \tag{3.5.1}$$

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I) \tag{3.5.2}$$

$$v_i(k) = \eta_i^k(I) \tag{3.5.3}$$

$$J = \sum_{i=1}^{N} J_i \tag{3.5.4}$$

$$J_i = E\Big\{K_i(x_i(T)) + \sum_{k=0}^{T-1} \Big[L_i(x_i(k), u_i(k)) + p_i'(k) v_i(k) - \sum_{j \neq i} p_j'(k)\, g_{ji}(x_i(k))\Big]\Big\} \tag{3.5.5}$$

If $\max_{p} \min_{\gamma, \eta} J$ exists and

$$E\Big\{v_i^*(k) - \sum_{j \neq i} g_{ij}(x_j^*(k))\Big\} = 0, \qquad i = 1,\ldots,N;\ k = 0,\ldots,T-1 \tag{3.5.6}$$

then $\gamma^*, \eta^*$ are the optimal strategies for Problem 3.1.

Proof: This problem can be cast into the form of Problem 3.3 by identifying

$$u(k) \ \text{with} \ \{u_i(k), v_i(k);\ i = 1,\ldots,N\} \tag{3.5.7}$$

$$\gamma^k \ \text{with} \ \{\gamma_i^k, \eta_i^k;\ i = 1,\ldots,N\} \tag{3.5.8}$$

$$\sum_{i=1}^{N} E\Big\{K_i(x_i(T)) + \sum_{k=0}^{T-1} \Big[L_i(x_i(k), u_i(k)) + p_i'(k)\Big(v_i(k) - \sum_{j \neq i} g_{ij}(x_j(k))\Big)\Big]\Big\}$$

$$= \sum_{i=1}^{N} E\Big\{K_i(x_i(T)) + \sum_{k=0}^{T-1} \Big[L_i(x_i(k), u_i(k)) + p_i'(k) v_i(k) - \sum_{j \neq i} p_j'(k)\, g_{ji}(x_i(k))\Big]\Big\} = \sum_{i=1}^{N} J_i \tag{3.5.10}$$

Theorem 3.4.4 can then be applied in a straightforward manner. Q.E.D.

Note that given any $p$, the minimization problem is separated into N uncoupled stochastic control problems. The ith controller needs only the structure of his own system as his a priori information. Thus there is decentralization of a priori as well as a posteriori information.

A two-level hierarchical decomposition for finding the optimal control strategy is possible.

Lower Level:

$$x_i(k+1) = f_i(x_i(k), v_i(k), u_i(k), \xi_i(k))$$

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I)$$

$$v_i(k) = \eta_i^k(I)$$

$$J_i(p) = E\Big\{K_i(x_i(T)) + \sum_{k=0}^{T-1} \Big[L_i(x_i(k), u_i(k)) + p_i'(k) v_i(k) - \sum_{j \neq i} p_j'(k)\, g_{ji}(x_i(k))\Big]\Big\} \tag{3.5.11}$$

Find $\gamma_i^k$ and $\eta_i^k$ such that $J_i(p)$ is minimized, $i = 1,\ldots,N$. Let $J_i^*(p)$ be the optimal cost associated with a particular $p$.

Higher Level:

$$\max_{p} \sum_{i=1}^{N} J_i^*(p) \tag{3.5.12}$$
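The two-level scheme can be tried out on a toy problem. The sketch below is only an illustration under simplifying assumptions that are not in the text: two scalar subsystems, a single time step, no noise, and hypothetical data `b` and `c`. Each lower level problem is solved in closed form for a given $p$, and the coordinator raises the total lower level cost by a plain gradient ascent step, using the constraint residual $v_i - c\,x_j$ as the gradient. Gradient ascent is one possible coordinator rule chosen here for simplicity; the text does not prescribe one.

```python
import numpy as np

# Hypothetical toy data: x_i = b_i + v_i + u_i, interaction constraint v_i = c * x_j.
b = np.array([1.0, -2.0])
c = 0.5

def lower_level(i, p):
    """Subsystem i minimizes x_i^2 + u_i^2 + p_i*v_i - c*p_j*x_i for a given p."""
    j = 1 - i
    x = (c * p[j] - p[i]) / 2.0   # stationarity with respect to x_i
    u = p[i] / 2.0                # stationarity with respect to u_i
    v = x - b[i] - u              # interaction estimate implied by the dynamics
    return x, u, v

def coordinate(steps=300, step_size=0.5):
    """Coordinator: gradient ascent on the sum of the optimal lower level costs."""
    p = np.zeros(2)
    for _ in range(steps):
        sols = [lower_level(i, p) for i in range(2)]
        x = np.array([s[0] for s in sols])
        v = np.array([s[2] for s in sols])
        residual = v - c * x[::-1]    # gradient of the dual cost with respect to p
        p = p + step_size * residual
    return p, x, v

p_star, x_dec, v_dec = coordinate()
```

At convergence the residual vanishes, so each $v_i$ equals the actual interaction $c\,x_j$, and the decentralized solution coincides with the one obtained by solving the coupled problem directly.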

Remark: The higher level problem is deterministic and static in nature, whereas the lower level problems are stochastic and dynamic, although uncoupled. The decentralized a priori information allows the off-line computation to be done in an algorithmic manner. Typically, the higher level coordinator will choose a $p$; the lower level controllers then compute the optimal cost associated with this $p$. The coordinator then chooses another $p$ to increase the optimal cost of the lower level systems. The decomposition is off-line because it is done before the system starts, using only a priori information. The advantages of this approach are the following:

(1) The overall stochastic control problem is split up into N stochastic control problems with lower dimension. Each of these can be solved more easily.

(2) Although the value of $p$ may change, the structure of the lower level problems remains the same, and hence essentially the stochastic control problems need only to be solved once.

6. Discussion and Perspectives

We have considered the stochastic control of N coupled systems with decentralized information structure. By defining a new kind of optimality, it is found that the optimal control strategies can be found in a decentralized manner. Moreover, given the optimal coordinating parameters, the control problems of the N subsystems are uncoupled. Thus the control strategies using decentralized a posteriori information can be computed with decentralized a priori information. Although this scheme is suboptimal with respect to the ordinary stochastic control problem, computationally it is more efficient.

Because of the nonlinear nature of the problem we cannot say much about the detailed computations involved. However, it is obvious that instead of one high dimensional stochastic control problem we now have N lower dimensional stochastic control problems and one extra deterministic optimization problem to be solved by the coordinator. In the next chapter, we shall look at the linear-quadratic-Gaussian problem in detail and obtain explicit solutions for these lower and higher level problems.



CHAPTER 4 


DECOMPOSITION FOR THE LINEAR-QUADRATIC-GAUSSIAN PROBLEM 

(OFF-LINE) 


1. Introduction 

In this chapter we apply the philosophy of Chapter 3 to the 
linear-quadratic-Gaussian problem. As pointed out in the introduction 
of Chapter 2, the solution to the linear-quadratic-Gaussian dynamic
team is not known yet. Even if it is found, the on-line computation
involved would make its implementation infeasible, since the estimates
involved have to be generated by infinite dimensional filters. The
control strategies obtained in this chapter are easily implementable.

The structure of the control decomposes very nicely into an open 
loop part and a closed loop part. This will be used later on to study 
the on-line "periodic" coordination of coupled systems (see Chapter 5) . 

In the next section, we formulate the LQG problem and decompose 
it into two levels. The equations needed by the lower level controllers 
and the coordinator are given in Section 3. The lower level problem 
with a linear term in the cost functional is solved in Section 4. 

In Section 5, the higher level problem is solved and found to bear a very close relationship to the deterministic linear quadratic problem.

2. Statement of the Problem

Consider the linear dynamic system

$$x_i(k+1) = A_{ii} x_i(k) + v_i(k) + B_i u_i(k) + \xi_i(k), \qquad i = 1,\ldots,N \tag{4.2.1}$$

$$v_i(k) = \sum_{j \neq i} A_{ij} x_j(k) \tag{4.2.2}$$

The cost functional is quadratic.

$$J = \sum_{i=1}^{N} J_i = \sum_{i=1}^{N} E\Big\{x_i'(T) F_i x_i(T) + \sum_{k=0}^{T-1} \big[x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k)\big]\Big\} \tag{4.2.3}$$

where $F_i$, $Q_i$, $R_i$ are positive definite matrices.

The measurements are given by

$$y_i(k) = C_i x_i(k) + \theta_i(k), \qquad i = 1,\ldots,N \tag{4.2.4}$$

Each controller is allowed only to use his past measurements to find the controls, i.e.,

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1)) \tag{4.2.5}$$

where

$$Y_i(k) = \{y_i(0),\ldots,y_i(k)\} \tag{4.2.6}$$

$$U_i(k) = \{u_i(0),\ldots,u_i(k)\} \tag{4.2.7}$$

It is required to find optimal control strategies such that $J$ is minimized.

$\xi_i(k)$, $k = 0,\ldots,T-1$ are independent Gaussian variables with zero mean and covariance $\Xi_i(k)$.

$\theta_i(k)$, $k = 0,\ldots,T-1$ are independent Gaussian variables with zero mean and covariance $\Theta_i(k)$.

$x_i(0)$ is Gaussian with mean $\bar{x}_i(0)$ and covariance $\Sigma_i(0)$.

$\xi_i(k)$, $\theta_j(k)$, $x_h(0)$, $i, j, h = 1,\ldots,N$ are all mutually independent.

The matrices $A_{ii}$, $A_{ij}$, $B_i$, $C_i$, $Q_i$, $R_i$ can be time-varying but for simplicity of notation, the dependence on $k$ has been omitted.

The general solution to this problem, assuming no communication of the 
a posteriori information between the controllers, is not known, although 
several particular cases have been considered [Al, C3] . We propose to 
solve this problem using the approach suggested in the previous chapter by 
defining a new kind of optimality. 

Problem 4.1:

$$x_i(k+1) = A_{ii} x_i(k) + v_i(k) + B_i u_i(k) + \xi_i(k) \tag{4.2.8}$$

$$J = \sum_{i=1}^{N} J_i = \sum_{i=1}^{N} E\Big\{x_i'(T) F_i x_i(T) + \sum_{k=0}^{T-1} \big[x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k)\big]\Big\} \tag{4.2.9}$$

$$E\Big\{v_i(k) - \sum_{j \neq i} A_{ij} x_j(k)\Big\} = 0 \tag{4.2.10}$$

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I) \tag{4.2.11}$$

$$v_i(k) = \eta_i^k(I) \tag{4.2.12}$$

$I$ consists of the a priori information contained in this model. It is required to find $\gamma_i^k$ and $\eta_i^k$ such that $J$ is minimized.

Using the results of Section 3.5, we obtain the following two-level problem.

Lower Level: (Problem 4.2)

$$x_i(k+1) = A_{ii} x_i(k) + v_i(k) + B_i u_i(k) + \xi_i(k)$$

$$u_i(k) = \gamma_i^k(Y_i(k), U_i(k-1); I)$$

$$v_i(k) = \eta_i^k(I)$$

$$J_i = E\Big\{x_i'(T) F_i x_i(T) + \sum_{k=0}^{T-1} \big[x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k) - \tilde{p}_i'(k) x_i(k)\big]\Big\} \tag{4.2.13}$$

$$\tilde{p}_i(k) = \sum_{j \neq i} A_{ji}'\, p_j(k) \tag{4.2.14}$$

It is desired to find $\gamma_i^k$, $\eta_i^k$ to minimize $J_i(p)$, $i = 1,\ldots,N$.

Higher Level: (Problem 4.3)

$$\max_{p} \sum_{i=1}^{N} J_i^*(p) \tag{4.2.15}$$

where $J_i^*(p)$ is the optimal cost in Problem 4.2 for a particular $p$.

3. Structure of the Decomposition

In this section we summarize the relevant equations needed by the lower level controllers and the coordinator.

The optimal control of the ith controller is given by

$$u_i^*(k) = -D_i(k+1)\big(\hat{x}_i(k) - \bar{x}_i(k)\big) - E_i(k+1)\, p_i(k) \tag{4.3.1}$$

The gain matrices are given by:

$$D_i(k+1) = T_i^{-1}(k+1)\, B_i' K_i(k+1) A_{ii} \tag{4.3.2}$$

$$T_i(k+1) = R_i + B_i' K_i(k+1) B_i \tag{4.3.3}$$

$$K_i(k) = Q_i + A_{ii}' K_i(k+1) A_{ii} - A_{ii}' K_i(k+1) B_i T_i^{-1}(k+1) B_i' K_i(k+1) A_{ii}, \qquad K_i(T) = F_i \tag{4.3.4}$$

$$E_i(k+1) = -\tfrac{1}{2} R_i^{-1}(k) B_i'(k) \tag{4.3.5}$$

$$S_i(k+1) = K_i(k+1) - K_i(k+1) B_i T_i^{-1}(k+1) B_i' K_i(k+1) \tag{4.3.6}$$
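The gain equations (4.3.2)-(4.3.6) form a backward recursion that each lower level controller can run off-line. A minimal NumPy sketch is given below, assuming constant matrices and using hypothetical subsystem data; the function name `lower_level_gains` and the numbers are illustrative only. The sign of $E_i$ follows the reading of (4.3.5) that makes (4.3.1) agree with (4.4.33).

```python
import numpy as np

def lower_level_gains(A, B, Q, R, F, T):
    """Backward recursion for K_i(k), T_i(k+1), D_i(k+1), S_i(k+1) and E_i."""
    K = {T: F}                                   # terminal condition K_i(T) = F_i
    D, Tm, S = {}, {}, {}
    for k in range(T - 1, -1, -1):
        Kn = K[k + 1]
        BtK = B.T @ Kn
        Tm[k + 1] = R + BtK @ B                           # (4.3.3)
        D[k + 1] = np.linalg.solve(Tm[k + 1], BtK @ A)    # (4.3.2)
        S[k + 1] = Kn - BtK.T @ np.linalg.solve(Tm[k + 1], BtK)  # (4.3.6)
        K[k] = Q + A.T @ S[k + 1] @ A                     # (4.3.4) written with S
    E = -0.5 * np.linalg.solve(R, B.T)                    # (4.3.5), constant R and B
    return K, D, Tm, S, E

# Hypothetical subsystem data (not from the text).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, F = np.eye(2), np.array([[1.0]]), np.eye(2)
K, D, Tm, S, E = lower_level_gains(A, B, Q, R, F, T=5)
```

None of these quantities depends on the coordinating signal, which is why the recursion needs to be run only once even if $p$ is changed by the coordinator.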


The estimates $\hat{x}_i(k)$ and $\bar{x}_i(k)$ are generated as follows.

$$\hat{x}_i(k+1) \triangleq E\{x_i(k+1) \mid Y_i(k+1), U_i(k)\}$$

$$= A_{ii}\hat{x}_i(k) + v_i^*(k) + B_i u_i^*(k) + G_i(k+1)\big[y_i(k+1) - C_i\big(A_{ii}\hat{x}_i(k) + v_i^*(k) + B_i u_i^*(k)\big)\big], \qquad \hat{x}_i(0) = \bar{x}_i(0) \tag{4.3.7}$$

where

$$G_i(k+1) = \Sigma_i(k+1|k)\, C_i' \big[C_i \Sigma_i(k+1|k) C_i' + \Theta_i(k+1)\big]^{-1} \tag{4.3.8}$$

$$\Sigma_i(k+1|k) = \Xi_i(k) + A_{ii}\big[\Sigma_i(k|k-1) - \Sigma_i(k|k-1) C_i' \big(C_i \Sigma_i(k|k-1) C_i' + \Theta_i(k)\big)^{-1} C_i \Sigma_i(k|k-1)\big] A_{ii}', \qquad \Sigma_i(0|-1) = \Sigma_i(0) \tag{4.3.9}$$

$$\bar{x}_i(k+1) = E\{x_i(k+1)\} = -K_i^{-1}(k+1)\big[r_i(k+1) + \tfrac{1}{2} p_i(k)\big] \tag{4.3.10}$$

$v_i^*(k)$ and $r_i(k)$ are given by

$$v_i^*(0) = -A_{ii}\bar{x}_i(0) - K_i^{-1}(1)\, r_i(1) - \tfrac{1}{2} S_i^{-1}(1)\, p_i(0) \tag{4.3.11}$$

$$v_i^*(k) = A_{ii} K_i^{-1}(k)\big[r_i(k) + \tfrac{1}{2} p_i(k-1)\big] - K_i^{-1}(k+1)\, r_i(k+1) - \tfrac{1}{2} S_i^{-1}(k+1)\, p_i(k), \qquad k = 1,\ldots,T-1 \tag{4.3.12}$$

$$r_i(0) = -\tfrac{1}{2}\tilde{p}_i(0) - \tfrac{1}{2} A_{ii}'\, p_i(0) - A_{ii}' S_i(1) A_{ii}\, \bar{x}_i(0) \tag{4.3.13}$$

$$Q_i K_i^{-1}(k)\, r_i(k) = -\tfrac{1}{2}\tilde{p}_i(k) - \tfrac{1}{2} A_{ii}'\, p_i(k) + \tfrac{1}{2} A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k)\, p_i(k-1), \qquad k = 1,\ldots,T-1 \tag{4.3.14}$$

$$r_i(T) = 0 \tag{4.3.15}$$

The structure of the control mechanism is illustrated in Fig. 4.1. The gain matrices can all be computed off-line, along with $r_i(k)$ and $v_i^*(k)$, which depend on $p(k)$. $K_i(k)$ is the solution of the Riccati equation assuming the systems are uncoupled and $D_i(k)$ is the optimal gain matrix for each of the uncoupled deterministic optimal problems.

Fig. 4.1 Structure of Control for the ith Controller

$\bar{x}_i(k)$ is the unconditional mean of $x_i(k)$ by the ith controller given only his a priori information. It can be computed off-line given $r_i(k)$ and $p_i(k)$.

$\hat{x}_i(k)$ is the best estimate of $x_i(k)$ given the measurements of the ith controller and his a priori information. It is generated using $v_i^*(k)$ calculated off-line and the on-line measurements $u_i^*(k)$ and $y_i(k)$.
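The off-line part of the lower level computation can be sketched as follows. Given a coordinating signal, the code evaluates $r_i(k)$ from (4.3.13)-(4.3.15), the a priori mean $\bar{x}_i(k)$ from (4.3.10) and the interaction estimate $v_i^*(k)$ from (4.3.11)-(4.3.12). It assumes constant matrices; the subsystem data, the random values used for $p_i(k)$ and $\tilde{p}_i(k)$, and the function name `open_loop_quantities` are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def open_loop_quantities(A, B, Q, R, F, p, p_tilde, xbar0, T):
    """Return r(k), xbar(k), v*(k) and the matrices K(k), S(k) for one subsystem."""
    K, S = {T: F}, {}
    for k in range(T - 1, -1, -1):               # uncoupled Riccati recursion
        Kn = K[k + 1]
        Tn = R + B.T @ Kn @ B
        S[k + 1] = Kn - Kn @ B @ np.linalg.solve(Tn, B.T @ Kn)
        K[k] = Q + A.T @ S[k + 1] @ A
    r = {T: np.zeros(len(A))}                    # r(T) = 0
    for k in range(1, T):                        # solve Q K^{-1}(k) r(k) = rhs
        rhs = (-0.5 * p_tilde[k] - 0.5 * A.T @ p[k]
               + 0.5 * A.T @ S[k + 1] @ A @ np.linalg.solve(K[k], p[k - 1]))
        r[k] = K[k] @ np.linalg.solve(Q, rhs)
    r[0] = -0.5 * p_tilde[0] - 0.5 * A.T @ p[0] - A.T @ S[1] @ A @ xbar0
    xbar, v = {0: xbar0}, {}
    for k in range(T):
        xbar[k + 1] = -np.linalg.solve(K[k + 1], r[k + 1] + 0.5 * p[k])
        v[k] = (-A @ xbar[k] - np.linalg.solve(K[k + 1], r[k + 1])
                - 0.5 * np.linalg.solve(S[k + 1], p[k]))
    return r, xbar, v, K, S

# Hypothetical data (not from the text).
rng = np.random.default_rng(0)
T = 4
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, F = np.eye(2), np.array([[1.0]]), np.eye(2)
p = [rng.standard_normal(2) for _ in range(T)]
p_tilde = [rng.standard_normal(2) for _ in range(T)]
r, xbar, v, K, S = open_loop_quantities(A, B, Q, R, F, p, p_tilde, np.array([1.0, 0.0]), T)
```

A useful consistency check is that the computed means propagate through the subsystem dynamics when the open loop control $\tfrac{1}{2} R_i^{-1} B_i' p_i(k)$ is applied.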

The coordinator finds the optimal $p^*(k)$'s by solving the following deterministic two-point boundary value problem

$$x(k+1) = A x(k) - \tfrac{1}{2} B R^{-1} B' \lambda(k+1) \tag{4.3.16}$$

$$\lambda(k) = A' \lambda(k+1) + 2 Q x(k) \tag{4.3.17}$$

$$x(0) \ \text{given}, \qquad \lambda(T) = 2 F x(T) \tag{4.3.18}$$

$$p^*(k) = -\lambda(k+1), \qquad k = 0,\ldots,T-1 \tag{4.3.19}$$

The matrices $A$, $B$ and $Q$ are as defined in Section 5. $R$ and $F$ are the block-diagonal matrices given by

$$R \triangleq \mathrm{diag}(R_1, R_2, \ldots, R_N), \qquad F \triangleq \mathrm{diag}(F_1, F_2, \ldots, F_N)$$

Alternatively, $p^*(k)$ can be expressed as follows.

$$p^*(k) = -2 K(k+1)\, x(k+1) \tag{4.3.20}$$

where

$$x(k+1) = \big(A - B T^{-1}(k+1) B' K(k+1) A\big) x(k), \qquad x(0) \ \text{given} \tag{4.3.21}$$

$$K(k) = Q + A' K(k+1)\big[I - B T^{-1}(k+1) B' K(k+1)\big] A, \qquad K(T) = F \tag{4.3.22}$$

$$T(k+1) = R + B' K(k+1) B \tag{4.3.23}$$

4. Solution of the Lower Level Problem

Since each controller knows the structure of his system as defined in Problem 4.2 we shall not include the a priori information in specifying the information structure of the controller. Thus $u_i(k)$ would depend on $Y_i(k)$ and $U_i(k-1)$ while $v_i(k)$ is allowed to depend on the a priori information only.

The problem as stated has a nonquadratic cost functional and controls which depend on different information sets. However, the information of $v_i(k)$ consists of a priori information only and thus is included in that of $u_i(k)$. This makes things easier than the general dynamic team problem and the following theorem can be used.


Theorem 4.4.1:

Consider the system

$$x(k+1) = f(x(k), u(k), v(k), \xi(k)) \tag{4.4.1}$$

$$u(k) = \gamma^k(Y(k)) \tag{4.4.2}$$

$$v(k) = \eta^k(Z(k)) \tag{4.4.3}$$

$$Z(k) \subset Y(k) \tag{4.4.4}$$

$\xi(k)$ is a white noise process driving the system and $Y(k)$, $Z(k)$ are information available to the controller.

$$Y(k) = \{y(0),\ldots,y(k);\ u(0),\ldots,u(k-1);\ v(0),\ldots,v(k-1)\} \tag{4.4.5}$$

$$y(k) = h(x(k), \theta(k)) \tag{4.4.6}$$

$\theta(k)$ is a white noise process.

$$J = E\Big\{K(x(T)) + \sum_{k=0}^{T-1} L(x(k), u(k), v(k))\Big\} \tag{4.4.7}$$

Then the optimal cost is given by

$$E\{V(Y(0), 0)\} \tag{4.4.8}$$

where $V(Y(k), k)$ satisfies the functional equation

$$V(Y(k), k) = \min_{u(k),\, \eta^k} E\{L(x(k), u(k), v(k)) + V(Y(k+1), k+1) \mid Y(k)\} \tag{4.4.9}$$

$$V(Y(T), T) = E\{K(x(T)) \mid Y(T)\} \tag{4.4.10}$$


Proof:

Define

$$V(Y(k), k) = \min_{\substack{u(k),\, \gamma^{k+1},\ldots,\gamma^{T-1} \\ \eta^{k},\ldots,\eta^{T-1}}} E\Big\{\sum_{t=k}^{T-1} L(x(t), u(t), v(t)) + K(x(T)) \,\Big|\, Y(k)\Big\}$$

$$= \min_{u(k),\, \eta^k} \Big\{E\{L(x(k), u(k), v(k)) \mid Y(k)\} + \min_{\substack{\gamma^{k+1},\ldots,\gamma^{T-1} \\ \eta^{k+1},\ldots,\eta^{T-1}}} E\Big\{\sum_{t=k+1}^{T-1} L(x(t), u(t), v(t)) + K(x(T)) \,\Big|\, Y(k)\Big\}\Big\} \tag{4.4.11}$$

Note that the minimization is done with respect to $u(k)$ and the control strategies $\gamma^{k+1},\ldots,\gamma^{T-1}$, $\eta^{k},\ldots,\eta^{T-1}$. The first term in the minimization is separated from the rest because it does not depend on $\gamma^{k+1},\ldots,\gamma^{T-1}$, $\eta^{k+1},\ldots,\eta^{T-1}$.

Using Lemma A.3 (in Appendix A),

$$\min_{\substack{\gamma^{k+1},\ldots,\gamma^{T-1} \\ \eta^{k+1},\ldots,\eta^{T-1}}} E\Big\{\sum_{t=k+1}^{T-1} L(x(t), u(t), v(t)) + K(x(T)) \,\Big|\, Y(k)\Big\}$$

$$= E\Big\{\min_{\substack{u(k+1),\, \gamma^{k+2},\ldots,\gamma^{T-1} \\ \eta^{k+1},\ldots,\eta^{T-1}}} E\Big\{\sum_{t=k+1}^{T-1} L(x(t), u(t), v(t)) + K(x(T)) \,\Big|\, Y(k+1)\Big\} \,\Big|\, Y(k)\Big\}$$

$$= E\{V(Y(k+1), k+1) \mid Y(k)\} \tag{4.4.12}$$

From this and equation (4.4.11) we obtain equation (4.4.9) and further

$$V(Y(0), 0) = \min_{\substack{u(0),\, \gamma^{1},\ldots,\gamma^{T-1} \\ \eta^{0},\ldots,\eta^{T-1}}} E\Big\{\sum_{k=0}^{T-1} L(x(k), u(k), v(k)) + K(x(T)) \,\Big|\, Y(0)\Big\} \tag{4.4.13}$$

Again by Lemma A.3

$$\min_{\substack{\gamma^{0},\ldots,\gamma^{T-1} \\ \eta^{0},\ldots,\eta^{T-1}}} E\Big\{\sum_{k=0}^{T-1} L(x(k), u(k), v(k)) + K(x(T))\Big\}$$

$$= E\Big\{\min_{\substack{u(0),\, \gamma^{1},\ldots,\gamma^{T-1} \\ \eta^{0},\ldots,\eta^{T-1}}} E\Big\{\sum_{k=0}^{T-1} L(x(k), u(k), v(k)) + K(x(T)) \,\Big|\, Y(0)\Big\}\Big\}$$

$$= E\{V(Y(0), 0)\}. \tag{4.4.14}$$
We can then apply this theorem to solve the lower-level problem. This will be stated in the following theorem. Since we have a linear system with Gaussian driving and observation noises, the information $Y(k)$ can be replaced by the sufficient statistic $\hat{x}_i(k)$. From now on we deal with $V_i(\hat{x}_i(k), k)$ instead of $V_i(Y_i(k), U_i(k-1), k)$.

Theorem 4.4.2:

The solution to the lower level problem is given by

$$u_i^*(k) = -D_i(k+1)\big(\hat{x}_i(k) - \bar{x}_i(k)\big) - E_i(k+1)\, p_i(k) \tag{4.4.15}$$

$$v_i^*(0) = -A_{ii}\bar{x}_i(0) - K_i^{-1}(1)\, r_i(1) - \tfrac{1}{2} S_i^{-1}(1)\, p_i(0) \tag{4.4.16}$$

$$v_i^*(k) = A_{ii} K_i^{-1}(k)\big[r_i(k) + \tfrac{1}{2} p_i(k-1)\big] - K_i^{-1}(k+1)\, r_i(k+1) - \tfrac{1}{2} S_i^{-1}(k+1)\, p_i(k), \qquad k = 1,\ldots,T-1 \tag{4.4.17}$$

where $D_i(k)$, $E_i(k)$, $\hat{x}_i(k)$, $\bar{x}_i(k)$, $r_i(k)$, $K_i(k)$ and $S_i(k)$ are as given in Section 3. Moreover, the optimal cost is given by $E\{V_i(\hat{x}_i(0), 0)\}$ where

$$V_i(\hat{x}_i(k), k) = \hat{x}_i'(k) K_i(k) \hat{x}_i(k) + 2 r_i'(k) \hat{x}_i(k) + s_i(k) \tag{4.4.18}$$

with

$$s_i(k) = s_i(k+1) + 2 r_i'(k+1) v_i^*(k) + v_i^{*\prime}(k) K_i(k+1) v_i^*(k)$$

$$\quad - \big[K_i(k+1) v_i^*(k) + r_i(k+1)\big]' B_i T_i^{-1}(k+1) B_i' \big[K_i(k+1) v_i^*(k) + r_i(k+1)\big]$$

$$\quad + p_i'(k) v_i^*(k) + \mathrm{tr}\, Q_i \Sigma_i(k|k) + \mathrm{tr}\, K_i(k+1)\big(\Sigma_i(k+1|k) - \Sigma_i(k+1|k+1)\big)$$

$$s_i(T) = \mathrm{tr}\, F_i \Sigma_i(T|T) \tag{4.4.19}$$

$$\Sigma_i(k|k) = \Sigma_i(k|k-1) - \Sigma_i(k|k-1) C_i' \big[C_i \Sigma_i(k|k-1) C_i' + \Theta_i(k)\big]^{-1} C_i \Sigma_i(k|k-1) \tag{4.4.20}$$

$$\Sigma_i(k+1|k) = A_{ii} \Sigma_i(k|k) A_{ii}' + \Xi_i(k), \qquad \Sigma_i(0|-1) = \Sigma_i(0) \tag{4.4.21}$$


Proof:

The functional equation corresponding to this problem is

$$V_i(\hat{x}_i(k), k) = \min_{u_i(k),\, v_i(k)} E\{x_i'(k) Q_i x_i(k) - \tilde{p}_i'(k) x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k) + V_i(\hat{x}_i(k+1), k+1) \mid \hat{x}_i(k)\} \tag{4.4.22}$$

where $v_i(k)$ is to be independent of any a posteriori information.

If we let $V_i(\hat{x}_i(k), k)$ be of the form given by equation (4.4.18), the right-hand side of (4.4.22) becomes

$$E\{x_i'(k) Q_i x_i(k) - \tilde{p}_i'(k) x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k) + \hat{x}_i'(k+1) K_i(k+1) \hat{x}_i(k+1) + 2 r_i'(k+1) \hat{x}_i(k+1) + s_i(k+1) \mid \hat{x}_i(k)\}$$

$$= \hat{x}_i'(k) Q_i \hat{x}_i(k) + \mathrm{tr}\, Q_i \Sigma_i(k|k) - \tilde{p}_i'(k) \hat{x}_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k)$$

$$\quad + \big[A_{ii}\hat{x}_i(k) + v_i(k) + B_i u_i(k)\big]' K_i(k+1) \big[A_{ii}\hat{x}_i(k) + v_i(k) + B_i u_i(k)\big]$$

$$\quad + \mathrm{tr}\, K_i(k+1) \Sigma_i(k+1|k) + 2 r_i'(k+1)\big[A_{ii}\hat{x}_i(k) + v_i(k) + B_i u_i(k)\big] + s_i(k+1) - \mathrm{tr}\, K_i(k+1) \Sigma_i(k+1|k+1) \tag{4.4.23}$$

where we have used the fact that

$$E\{\hat{x}_i'(k+1) K_i(k+1) \hat{x}_i(k+1) \mid \hat{x}_i(k)\} = \big[A_{ii}\hat{x}_i(k) + v_i(k) + B_i u_i(k)\big]' K_i(k+1) \big[A_{ii}\hat{x}_i(k) + v_i(k) + B_i u_i(k)\big] + \mathrm{tr}\, K_i(k+1)\big(\Sigma_i(k+1|k) - \Sigma_i(k+1|k+1)\big) \tag{4.4.24}$$

Given $v_i(k)$, minimizing (4.4.23) with respect to $u_i(k)$ gives

$$u_i^*(k) = -T_i^{-1}(k+1) B_i' \big[K_i(k+1) A_{ii} \hat{x}_i(k) + K_i(k+1) v_i(k) + r_i(k+1)\big] \tag{4.4.25}$$

Denote (4.4.23) with $u_i^*$ substituted in by $W_i(\hat{x}_i(k), k)$. To minimize with respect to $v_i(k)$ we minimize $E\{W_i(\hat{x}_i(k), k)\}$. This gives

$$p_i(k) + 2 K_i(k+1)\big[A_{ii}\bar{x}_i(k) + B_i \bar{u}_i^*(k)\big] + 2 K_i(k+1) v_i^*(k) + 2 r_i(k+1) = 0 \tag{4.4.26}$$

where

$$\bar{u}_i^*(k) = E\{u_i^*(k)\} = -T_i^{-1}(k+1) B_i' \big[K_i(k+1) A_{ii} \bar{x}_i(k) + K_i(k+1) v_i(k) + r_i(k+1)\big] \tag{4.4.27}$$

Substituting equation 4.4.27 into equation 4.4.26 we have

$$\big[I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big] K_i(k+1) v_i^*(k) = -\big[I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big]\big[K_i(k+1) A_{ii} \bar{x}_i(k) + r_i(k+1)\big] - \tfrac{1}{2} p_i(k) \tag{4.4.28}$$

Since $K_i(k+1)$ is invertible (see Appendix B),

$$S_i(k+1) = K_i(k+1) - K_i(k+1) B_i T_i^{-1}(k+1) B_i' K_i(k+1) = \big[K_i^{-1}(k+1) + B_i R_i^{-1} B_i'\big]^{-1} \tag{4.4.29}$$

and $\big[I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big] = S_i(k+1) K_i^{-1}(k+1)$ is then invertible. Thus

$$v_i^*(k) = -A_{ii}\bar{x}_i(k) - K_i^{-1}(k+1)\, r_i(k+1) - \tfrac{1}{2} S_i^{-1}(k+1)\, p_i(k) \tag{4.4.30}$$

$$u_i^*(k) = -T_i^{-1}(k+1) B_i' \big[K_i(k+1) A_{ii}\big(\hat{x}_i(k) - \bar{x}_i(k)\big) - \tfrac{1}{2}\big(I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big)^{-1} p_i(k)\big] \tag{4.4.31}$$

It can be shown that (see Appendix B)

$$T_i^{-1}(k+1) B_i' \big(I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big)^{-1} = R_i^{-1} B_i' \tag{4.4.32}$$

Thus

$$u_i^*(k) = -T_i^{-1}(k+1) B_i' K_i(k+1) A_{ii}\big(\hat{x}_i(k) - \bar{x}_i(k)\big) + \tfrac{1}{2} R_i^{-1} B_i'\, p_i(k) \tag{4.4.33}$$

By substitution into equation 4.4.22 and identifying the terms quadratic in $\hat{x}_i(k)$, linear in $\hat{x}_i(k)$ and independent of $\hat{x}_i(k)$ we obtain equations for $K_i(k)$, $s_i(k)$ as well as

$$r_i(k) = -\tfrac{1}{2}\tilde{p}_i(k) + A_{ii}'\big[I - K_i(k+1) B_i T_i^{-1}(k+1) B_i'\big] r_i(k+1) + A_{ii}' S_i(k+1) v_i^*(k), \qquad r_i(T) = 0 \tag{4.4.34}$$

To find the optimal controls $u_i^*(k)$ and the optimal "estimates" $v_i^*(k)$, a two-point boundary value problem has to be solved. This involves equations 4.4.30, 4.4.34, and the following equation

$$\bar{x}_i(k+1) = A_{ii}\bar{x}_i(k) + v_i^*(k) + B_i \bar{u}_i^*(k), \qquad \bar{x}_i(0) \ \text{given} \tag{4.4.35}$$

From equation 4.4.33

$$\bar{u}_i^*(k) = \tfrac{1}{2} R_i^{-1} B_i'\, p_i(k) \tag{4.4.36}$$

Substitution of (4.4.30) and (4.4.36) into (4.4.35) yields

$$\bar{x}_i(k+1) = -K_i^{-1}(k+1)\big[r_i(k+1) + \tfrac{1}{2} p_i(k)\big] \tag{4.4.37}$$

From these we obtain equations 4.4.16 and 4.4.17. Substitution of (4.4.16) into (4.4.34) yields (4.3.13). Substitution of (4.4.17) into (4.4.34) gives

$$\big[I - A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k)\big] r_i(k) = -\tfrac{1}{2}\tilde{p}_i(k) - \tfrac{1}{2} A_{ii}'\, p_i(k) + \tfrac{1}{2} A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k)\, p_i(k-1) \tag{4.4.38}$$

Since from the Riccati equation

$$I - A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k) = Q_i K_i^{-1}(k) \tag{4.4.39}$$

we obtain equation 4.3.14. Essentially, the two-point boundary value problem is uncoupled and becomes a single equation in $r_i(k)$. $r_i(k)$ is uniquely defined when $Q_i$ is positive definite. This is a sufficient condition for the positive definiteness of $K_i(k)$, $k = 0,\ldots,T-1$. Q.E.D.
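The identity (4.4.32) is deferred to Appendix B in the text. One way to see it, sketched here under the assumption that $R_i$ and $K_i(k+1)$ are invertible and writing $K = K_i(k+1)$, $T = R + B'KB$, $S = S_i(k+1)$ with subscripts dropped, is to combine (4.4.29) with the definition of $T$:

$$\big(I - K B T^{-1} B'\big)^{-1} = \big(S K^{-1}\big)^{-1} = K S^{-1} = K\big(K^{-1} + B R^{-1} B'\big) = I + K B R^{-1} B'$$

$$T^{-1} B' \big(I + K B R^{-1} B'\big) = T^{-1}\big(R + B' K B\big) R^{-1} B' = R^{-1} B'$$

The open loop part of the control therefore involves only $R_i$ and $B_i$, and not the Riccati solution $K_i$.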

The control $u_i^*(k)$ which is actually applied by the ith controller consists of two parts: a closed loop part which depends on the measurements and an open loop part which does not. The closed loop part can be written to depend on the difference between the a priori and the a posteriori estimates of the ith controller about the state of the ith subsystem. It looks like the solution of a tracking problem with $\bar{x}_i(k)$ as the reference state. In fact, the optimal cost to go $V_i(\hat{x}_i(k), k)$ has a form similar to that of the tracking problem. The open loop part depends only on $p$, the coordinating signals received from the higher level. When the a priori and a posteriori estimates of the local controllers are the same, as in the case of no measurements, the closed loop part disappears and only the open loop control remains. In the next section we will find out what the open loop part really is.

5. Solution of the Higher Level Problem

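The higher level problem is choosing the optimal $p^*$ to maximize

$$J^*(p) = \sum_{i=1}^{N} J_i^*(p)$$

From Section 4,

$$J_i^*(p) = \bar{x}_i'(0) K_i(0) \bar{x}_i(0) + 2 r_i'(0) \bar{x}_i(0) + s_i(0) \tag{4.5.1}$$

where $r_i(0)$ satisfies equation 4.3.13 and $s_i(0)$ is given by equation 4.4.19. Since $\bar{x}_i'(0) K_i(0) \bar{x}_i(0)$ is independent of $p$, the higher level problem is

$$\max_{p} \sum_{i=1}^{N} \big[2 r_i'(0) \bar{x}_i(0) + s_i(0)\big] \tag{4.5.2}$$

Let

$$r(k) \triangleq \begin{bmatrix} r_1(k) \\ r_2(k) \\ \vdots \\ r_N(k) \end{bmatrix}, \quad K(k) \triangleq \mathrm{diag}(K_1(k), \ldots, K_N(k)), \quad Q \triangleq \mathrm{diag}(Q_1, \ldots, Q_N),$$

$$A \triangleq \begin{bmatrix} A_{11} & A_{12} & \cdots \\ A_{21} & A_{22} & \cdots \\ \vdots & & A_{NN} \end{bmatrix}, \quad B \triangleq \mathrm{diag}(B_1, \ldots, B_N), \quad S(k) \triangleq \mathrm{diag}(S_1(k), \ldots, S_N(k)) \tag{4.5.3}$$

with $p(k)$ and $\bar{x}(k)$ stacked in the same way as $r(k)$.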

Then equations 4.3.13 and 4.3.14 become

$$r(0) = -\tfrac{1}{2} A' p(0) - \big(I - Q K^{-1}(0)\big) K(0)\, \bar{x}(0) \tag{4.5.4}$$

$$Q K^{-1}(k)\, r(k) = -\tfrac{1}{2} A' p(k) + \tfrac{1}{2}\big(I - Q K^{-1}(k)\big) p(k-1), \qquad k = 1,\ldots,T-1 \tag{4.5.5}$$

$$s_i(0) = \sum_{k=0}^{T-1} \big(s_i(k) - s_i(k+1)\big) + s_i(T)$$

$$= \bar{x}_i'(0) A_{ii}' S_i(1) A_{ii} \bar{x}_i(0) + \mathrm{tr}\, K_i(T) \Sigma_i(T|T) + \sum_{k=0}^{T-1} \big\{\mathrm{tr}\, Q_i \Sigma_i(k|k) + \mathrm{tr}\, K_i(k+1)\big(\Sigma_i(k+1|k) - \Sigma_i(k+1|k+1)\big)\big\}$$

$$\quad + \sum_{k=1}^{T-1} \big\{ r_i'(k)\big[K_i^{-1}(k) A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k) - K_i^{-1}(k)\big] r_i(k) + p_i'(k-1)\big[K_i^{-1}(k) A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k) - K_i^{-1}(k)\big] r_i(k)$$

$$\quad + \tfrac{1}{4} p_i'(k-1)\big[K_i^{-1}(k) A_{ii}' S_i(k+1) A_{ii} K_i^{-1}(k) - S_i^{-1}(k)\big] p_i(k-1) \big\}$$

$$\quad - r_i'(T) K_i^{-1}(T) r_i(T) - p_i'(T-1) K_i^{-1}(T) r_i(T) - \tfrac{1}{4} p_i'(T-1) S_i^{-1}(T) p_i(T-1) \tag{4.5.6}$$

The terms involving $\Sigma_i(k|k)$ and $\Sigma_i(k+1|k)$ are independent of $p$. Thus the quantity to be maximized is

$$2 r'(0) \bar{x}(0) + \sum_{k=1}^{T-1} \big\{ -r'(k) K^{-1}(k) Q K^{-1}(k) r(k) - p'(k-1) K^{-1}(k) Q K^{-1}(k) r(k) - \tfrac{1}{4} p'(k-1)\big[K^{-1}(k) Q K^{-1}(k) + B R^{-1} B'\big] p(k-1) \big\} - \tfrac{1}{4} p'(T-1) S^{-1}(T) p(T-1) \tag{4.5.7}$$

Redefining

$$\lambda(k) = -p(k-1), \qquad k = 1,\ldots,T$$

we have

$$\max \ \lambda'(1) A \bar{x}(0) + \sum_{k=1}^{T-1} \big\{ -r'(k) K^{-1}(k) Q K^{-1}(k) r(k) + \lambda'(k) K^{-1}(k) Q K^{-1}(k) r(k) - \tfrac{1}{4} \lambda'(k)\big[K^{-1}(k) Q K^{-1}(k) + B R^{-1} B'\big] \lambda(k) \big\} - \tfrac{1}{4} \lambda'(T) S^{-1}(T) \lambda(T) \tag{4.5.8}$$

with respect to

$r(k)$; $k = 1,\ldots,T-1$

$\lambda(k)$; $k = 1,\ldots,T$

such that

$$Q K^{-1}(k)\, r(k) = \tfrac{1}{2} A' \lambda(k+1) - \tfrac{1}{2}\big[I - Q K^{-1}(k)\big] \lambda(k), \qquad k = 1,\ldots,T-1 \tag{4.5.9}$$

Theorem 4.5.1: 

The optimal solution X_* (k) to equations 4.5.8 and 4.5.9 corresponds to the 
costates of the deterministic linear regulator problem for the entire 
system. Minimize 

T-l 

x' (T)F x(T) + l x' (k) 2 2L< k > +u' 00 Ru(k) (4.5.10) 

k=0 

subject to sc(k+l) ** A x(k) + JB ju(k) 

x(0) = x(0) given (4.5.11) 

Proof:

We form the Lagrangian H(\lambda, r, \alpha) given by

    H(\lambda, r, \alpha) = \lambda'(1) A \bar{x}(0) + \sum_{k=1}^{T-1} \{ - r'(k) K^{-1}(k) Q K^{-1}(k) r(k)
        + \lambda'(k) K^{-1}(k) Q K^{-1}(k) r(k) - \frac{1}{4} \lambda'(k) [ K^{-1}(k) Q K^{-1}(k) + B R^{-1} B' ] \lambda(k)
        - \alpha'(k) [ Q K^{-1}(k) r(k) - \frac{1}{2} A' \lambda(k+1) + \frac{1}{2} ( I - Q K^{-1}(k) ) \lambda(k) ] \}
        - \frac{1}{4} \lambda'(T) S^{-1}(T) \lambda(T)                              (4.5.12)

Using the necessary conditions for optimality we obtain

    \partial H / \partial \lambda(1) = A \bar{x}(0) + K^{-1}(1) Q K^{-1}(1) r(1) - \frac{1}{2} [ K^{-1}(1) Q K^{-1}(1) + B R^{-1} B' ] \lambda(1)
        - \frac{1}{2} [ I - Q K^{-1}(1) ]' \alpha(1) = 0                             (4.5.13)

    \partial H / \partial \lambda(k) = K^{-1}(k) Q K^{-1}(k) r(k) - \frac{1}{2} [ K^{-1}(k) Q K^{-1}(k) + B R^{-1} B' ] \lambda(k) + \frac{1}{2} A \alpha(k-1)
        - \frac{1}{2} [ I - Q K^{-1}(k) ]' \alpha(k) = 0,    k = 2, ..., T-1         (4.5.14)

    \partial H / \partial \lambda(T) = \frac{1}{2} A \alpha(T-1) - \frac{1}{2} S^{-1}(T) \lambda(T) = 0    (4.5.15)

    \partial H / \partial r(k) = - 2 K^{-1}(k) Q K^{-1}(k) r(k) + K^{-1}(k) Q K^{-1}(k) \lambda(k) - K^{-1}(k) Q \alpha(k) = 0,
        k = 1, ..., T-1                                                             (4.5.16)

    \partial H / \partial \alpha(k) = - Q K^{-1}(k) r(k) + \frac{1}{2} A' \lambda(k+1) - \frac{1}{2} [ I - Q K^{-1}(k) ] \lambda(k) = 0,
        k = 1, ..., T-1                                                             (4.5.17)

From (4.5.13) and (4.5.16), we obtain

    \alpha(1) = 2 A \bar{x}(0) - B R^{-1} B' \lambda(1)                             (4.5.18)

From (4.5.14) and (4.5.16), we obtain

    \alpha(k) = A \alpha(k-1) - B R^{-1} B' \lambda(k),    k = 2, ..., T-1          (4.5.19)

Since

    S^{-1}(T) = F^{-1} + B R^{-1} B'                                                (4.5.20)

equation 4.5.15 becomes

    ( F^{-1} + B R^{-1} B' ) \lambda(T) = A \alpha(T-1)

    \lambda(T) = F [ A \alpha(T-1) - B R^{-1} B' \lambda(T) ]                       (4.5.21)

From (4.5.17) and (4.5.16), we obtain

    \lambda(k) = A' \lambda(k+1) + Q \alpha(k)                                      (4.5.22)

Let

    \alpha(k) \triangleq 2 x(k)

Then we have

    x(k+1) = A x(k) - \frac{1}{2} B R^{-1} B' \lambda(k+1),    k = 0, ..., T-1      (4.5.23)

    \lambda(k) = A' \lambda(k+1) + 2 Q x(k)                                         (4.5.24)

    x(0) = \bar{x}(0)

    \lambda(T) = 2 F x(T)                                                           (4.5.25)

This is the two-point boundary value problem associated with the
optimal control problem (4.5.10) and (4.5.11) [A3]. Q.E.D.

Since \lambda(k) = 2 K(k) x(k), where K(k) is the solution of the Riccati
equation for the whole system

    K(k) = Q + A' K(k+1) [ I - B T^{-1}(k+1) B' K(k+1) ] A

    K(T) = F                                                                        (4.5.26)

    T(k+1) = R + B' K(k+1) B                                                        (4.5.27)

the optimal control u_i^* given the optimal coordinating signal p^*(k) is

    u_i^*(k) = - T_i^{-1}(k+1) B_i' K_i(k+1) A_{ii} ( \hat{x}_i(k) - \bar{x}_i(k) ) - \frac{1}{2} R_i^{-1} B_i' \lambda_i^*(k+1)

             = - T_i^{-1}(k+1) B_i' K_i(k+1) A_{ii} ( \hat{x}_i(k) - \bar{x}_i(k) ) - \frac{1}{2} [ R^{-1} B' \lambda^*(k+1) ]_i    (4.5.28)

where [a]_i corresponds to the ith component of vector a.
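The relation between the Riccati solution and the costates can be checked numerically. The following is only a sketch under assumed data: the matrices A, B, Q, R, F, the horizon and the initial state are illustrative choices, not values from the text, and the function names are hypothetical. It runs the backward recursion (4.5.26)-(4.5.27), generates the optimal deterministic trajectory, and forms the costate as twice K(k) x(k).

```python
import numpy as np

def riccati(A, B, Q, R, F, T):
    """Backward Riccati recursion (4.5.26)-(4.5.27); returns [K(0), ..., K(T)]."""
    n = A.shape[0]
    K = [None] * (T + 1)
    K[T] = F
    for k in range(T - 1, -1, -1):
        Kn = K[k + 1]
        Tn = R + B.T @ Kn @ B                      # T(k+1) = R + B'K(k+1)B
        K[k] = Q + A.T @ Kn @ (np.eye(n) - B @ np.linalg.solve(Tn, B.T @ Kn)) @ A
    return K

def regulator_trajectory(A, B, R, K, x0):
    """Optimal deterministic states and costates starting from x0."""
    T = len(K) - 1
    xs = [np.asarray(x0, dtype=float)]
    for k in range(T):
        Kn = K[k + 1]
        Tn = R + B.T @ Kn @ B
        u = -np.linalg.solve(Tn, B.T @ Kn @ A @ xs[k])   # u(k) = -T^{-1}(k+1) B'K(k+1) A x(k)
        xs.append(A @ xs[k] + B @ u)
    lams = [2.0 * K[k] @ xs[k] for k in range(T + 1)]    # lambda(k) = 2 K(k) x(k)
    return xs, lams

# Illustrative data (assumed, not taken from the text).
A = np.array([[1.0, 0.2], [0.1, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
F = np.eye(2)
T_HORIZON = 5

K = riccati(A, B, Q, R, F, T_HORIZON)
xs, lams = regulator_trajectory(A, B, R, K, [1.0, -1.0])
```

With these sequences one can verify that the state and costate satisfy the forward and backward equations of the two-point boundary value problem, including the terminal condition on the costate.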



We now show how \bar{x}_i(k) is related to the solution of the deterministic
linear regulator problem of the entire system.

Theorem 4.5.2 : 

Given the optimal coordinating parameters, the unconditioned estimates
\bar{x}(k) of the state of the system by the lower level (given by equation (4.4.35))
are equal to the unconditional estimate of the coordinator, i.e.,

    \bar{x}(k+1) = A \bar{x}(k) - \frac{1}{2} B R^{-1} B' \lambda^*(k+1)

    \bar{x}(0) given                                                                (4.5.29)


Proof : 


By equation 4.4.30

    v_i^*(0) = - A_{ii} \bar{x}_i(0) - K_i^{-1}(1) r_i(1) + \frac{1}{2} ( K_i^{-1}(1) + B_i R_i^{-1} B_i' ) \lambda_i^*(1)    (4.5.30)

By equation 4.5.16

    - K_i^{-1}(1) r_i(1) + \frac{1}{2} K_i^{-1}(1) \lambda_i^*(1) - \bar{x}_i(1) = 0

Thus

    v_i^*(0) = - A_{ii} \bar{x}_i(0) + \bar{x}_i(1) + \frac{1}{2} B_i R_i^{-1} B_i' \lambda_i^*(1)    (4.5.31)

             = - A_{ii} \bar{x}_i(0) + \sum_{j=1}^{N} A_{ij} \bar{x}_j(0)

             = \sum_{j \neq i} A_{ij} \bar{x}_j(0)                                   (4.5.32)

We then have

    \bar{x}(1) = A \bar{x}(0) - \frac{1}{2} B R^{-1} B' \lambda^*(1)                (4.5.33)

By induction, we can easily show that

    v_i^*(k) = \sum_{j \neq i} A_{ij} \bar{x}_j(k)                                   (4.5.34)

and hence equation 4.5.29. Q.E.D. 

We have thus verified constraint (4.2.10). Moreover, we have shown 
that the unconditional mean (a priori estimate) of the ith controller given 
the optimal coordinating parameter and the uncoupled subsystem is the same 
as the a priori estimate obtained by the coordinator. The optimal control 
u_i^*(k) is given by

    u_i^*(k) = - T_i^{-1}(k+1) B_i' K_i(k+1) A_{ii} ( \hat{x}_i(k) - \bar{x}_i(k) ) - [ T^{-1}(k+1) B' K(k+1) A \bar{x}(k) ]_i    (4.5.35)

where

    T(k+1) = R + B' K(k+1) B                                                        (4.5.36)

This optimal control consists of two parts, a closed-loop part which
has been discussed before and an open-loop part. The open-loop part is the
optimal deterministic control for the whole system assuming no measurements
are made. Thus the optimal control u_i^*(k) has a deterministic component
which takes into account the effect of the coupling and a closed-loop part
which utilizes the local information available. The closed-loop part resembles
the solution of a tracking problem where the a priori estimate by the
coordinator is the reference state.

6. Discussion and Perspectives

We have obtained an off-line decomposition of the linear-quadratic-
Gaussian problem. It is found that the optimal control strategy consists
of two parts: a closed-loop part which can be generated by the lower level
controller himself and an open-loop part which depends on the coordinating
parameter p. The closed-loop part consists of the optimal deterministic
gain for the ith subsystem acting on the difference of two estimates. The
optimal coordinating parameter p is essentially the costate corresponding
to the optimal deterministic control of the entire system using the mean of
x(0) as its initial state. Then the open-loop part is the optimal
deterministic control of the whole system. The scheme of control is simpler than
the solution to the optimal dynamic team since it requires less on-line
and off-line computation. Compared with the centralized case, when there
is communication among all the controllers, it is also simpler since a full
dimensional Kalman-Bucy filter has been replaced by N local filters. The
decrease in computation and communication is accompanied by a loss in
mathematical optimality.



CHAPTER 5 


DECOMPOSITION OF STOCHASTIC DYNAMIC SYSTEMS (ON-LINE) 

1. Introduction 

In this chapter we study the on-line decomposition of stochastic 
dynamic systems. The off-line decomposition of stochastic dynamic systems 
has been considered in the previous two chapters. Based on the a priori 
information, the coordinator transmits coordinating parameters to the lower 
level controllers. With the optimal choice of these parameters, the system 
is coordinated in the sense that the action of the other subsystems on the 
ith subsystem and its estimate by the ith controller are equal on the 
average. Once the system starts running, the coordinator's duty is finished. 

In some situations, the coordinator receives new information while 
the system is running. This new information can be used to improve the 
performance of the system. Instead of off-line coordination, we thus have 
on-line periodic coordination with the coordinator processing the new 
information and transmitting new coordinating parameters. Two kinds of 
on-line coordination will be considered in this chapter, depending on the 
coordinator's assumption about the availability of future information. 

When the coordinator assumes that future information will not be available,
open-loop feedback optimal coordination is obtained. When future information
is assumed to be available, then truly closed-loop coordination is obtained.

Roughly speaking, the issue of periodic on-line coordination can 
be explained as follows. Each local controller collects all his measurements 
(e.g. once a day). On the basis of his own measurements, coordinating 
signals, etc., he makes his (daily) decisions. Every so often
(say once a week) each local controller transmits all his measurements
to the coordinator. The two cases considered correspond to

(a) the coordinator does not know if and when local 
measurements will be transmitted, hence he 
operates under the pessimistic assumption that 
no further measurements will be made (open loop 
feedback optimal strategy) . 

(b) the coordinator knows a priori that he will 
receive periodically all measurements, and his 
coordinating strategy reflects the knowledge 
that in the future he will receive such 
measurements (the closed loop case) . 

In the next section we formulate the on-line coordination problem 
when the coordinator makes measurements after the system starts running. 

The open loop feedback optimal concept in stochastic control is applied 
to coordination in Section 3. In Section 4, the open loop feedback 
optimal coordination of the LQG problem is investigated. The solutions 
are found to be rather physically intuitive. In Section 5, we study the 
truly closed loop mode of coordination. A functional equation which has 
to be solved is obtained. This is compared with the open loop feedback 
optimal case. In Section 6 a special linear dynamic team problem consisting
of independent subsystems with the only coupling in the terminal cost is
considered. This is used in Section 7 to obtain a decomposition for the
lower level problem between updating times for the closed loop optimal
coordination of the linear-quadratic-Gaussian problem. The resulting
control strategies are very similar in form to those obtained from the open loop
feedback optimal case. The difference between open loop feedback optimal
coordination and closed loop optimal coordination is discussed.

2. Formulation of the On-Line Coordination Problem

Let the coordinator collect the measurements of all the lower level
controllers periodically every \ell units of time. For simplicity, we assume
T = m\ell, where m is some integer.

Let Y_0(k) be the information available to the coordinator at time k.
Then

    Y_0(k\ell) = Y_0(k\ell+1) = ... = Y_0(k\ell+\ell-1)

               = \{ Y_i(k\ell), U_i(k\ell-1);  i = 1, ..., N \},

    k = 0, ..., m-1                                                                 (5.2.1)
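The periodic information pattern of (5.2.1) can be illustrated with a small sketch. The data layout is an assumption made only for illustration: `measurements[i][t]` and `controls[i][t]` are hypothetical per-controller records, and the function names do not come from the text. The coordinator's information is frozen at the last updating time and changes only when a new batch of local data arrives.

```python
def last_update_time(k: int, ell: int) -> int:
    """Most recent updating time n*ell with n*ell <= k < n*ell + ell."""
    return (k // ell) * ell

def coordinator_information(k, ell, measurements, controls):
    """Y_0(k): local histories frozen at the last updating time.

    measurements[i][t] and controls[i][t] are hypothetical records of the
    ith controller's measurement and control at time t.
    """
    t0 = last_update_time(k, ell)
    return {
        i: {
            "Y": measurements[i][: t0 + 1],   # measurements at times 0, ..., t0
            "U": controls[i][:t0],            # controls at times 0, ..., t0 - 1
        }
        for i in measurements
    }
```

For example, with an updating period of 3 the coordinator's information at times 3, 4 and 5 is identical and contains the local data up to time 3 only.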


The control of the ith controller is allowed to depend on Y_0(k) as well
as Y_i(k) and U_i(k-1). We shall show that in certain cases Y_0(k) can be
replaced by some sufficient statistics. Given his available information
Y_0(k), the coordinator requires v_i(t) to be equal to \sum_{j \neq i} g_{ij}(x_j(t)) on the
average, where t \geq k. Thus we have the following formulation.


Problem 5.1:
Given

    x_i(k+1) = f_i( x_i(k), v_i(k), u_i(k), \xi_i(k) ),    i = 1, ..., N            (5.2.2)

    J = \sum_{i=1}^{N} J_i

    J_i = E\{ K_i(x_i(T)) + \sum_{k=0}^{T-1} L_i( x_i(k), u_i(k) ) \}               (5.2.3)

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \} = 0,    t \geq k,
        k = 0, ..., T-1;  i = 1, ..., N                                             (5.2.4)

    u_i(k) = \gamma_i^k( Y_i(k), U_i(k-1), Y_0(k); I )                              (5.2.5)

    v_i(k) = \eta_i^k( Y_0(k); I )                                                  (5.2.6)

where I has the same interpretation as in Section 3.3. Find optimal control
strategies \gamma_i^k and \eta_i^k, i = 1, ..., N; k = 0, ..., T-1 such that J is
minimized.

Comparing with the off-line case discussed in Chapter 3, we see that 
the constraint has to be satisfied more exactly. In the previous case, the 
constraint is only required to be satisfied with respect to the a priori 
information. Now it has to be satisfied with respect to the updated 
information of the coordinator. By the nested property of the conditional 
expectation, it is easily seen that equation 5.2.4 implies equation 3.3.3 
since 

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) \} = E\{ E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \} \},
        t \geq k                                                                    (5.2.7)
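The nested property of the conditional expectation used here can be checked on a toy example. The sketch below assumes a hypothetical finite sample space: each outcome carries a probability, a label standing for the coordinator's information, and a value of the constraint residual. None of these numbers come from the text. When the conditional mean of the residual is zero for every value of the coordinator's information, the unconditional mean is zero as well.

```python
from fractions import Fraction

# Hypothetical outcomes: (probability, information label, residual value).
OUTCOMES = [
    (Fraction(1, 4), "y_a", Fraction(2)),
    (Fraction(1, 4), "y_a", Fraction(-2)),
    (Fraction(1, 6), "y_b", Fraction(3)),
    (Fraction(1, 3), "y_b", Fraction(-3, 2)),
]

def conditional_means(outcomes):
    """Return the conditional mean of the residual and the probability of each label."""
    mass, weighted = {}, {}
    for p, label, e in outcomes:
        mass[label] = mass.get(label, 0) + p
        weighted[label] = weighted.get(label, 0) + p * e
    return {label: weighted[label] / mass[label] for label in mass}, mass

def unconditional_mean(outcomes):
    return sum(p * e for p, _, e in outcomes)

cond, mass = conditional_means(OUTCOMES)
# Outer expectation of the inner conditional expectation.
tower = sum(mass[label] * cond[label] for label in cond)
```

Exact rational arithmetic is used so that the equality between the two sides holds without rounding.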

Whether it is off-line coordination as discussed in Chapter 3 or
on-line coordination treated here, the lower level controls u_i(k) are all
closed loop, i.e., they depend on the past information available and are
computed based on the assumption that further measurements will be made.

The terms "off-line" and "on-line" refer only to the mode of coordination.
For on-line coordination, there are many possibilities. We shall consider
two here, open-loop feedback optimal coordination and closed-loop optimal
coordination, and discuss their similarities and differences in both the
general and the linear-quadratic-Gaussian case.

3. Open-Loop Feedback Optimal Periodic Coordination

The philosophy of open-loop feedback optimal controls is essentially
the following [D2, C5, T2]. At each time k in the control interval

1. The statistics of the state of the system x(k) are
generated (possibly with a nonlinear filter) from the
available observations.

2. Assuming that no measurements will be made in the future, the
optimal control sequence u*(k), u*(k+1), ..., u*(T-1) is
generated based on the currently available data by solving
an open-loop control problem, the cost functional being
the cost to go from time k conditioned on the data
available at time k.

3. The optimal control sequence is applied from time k to
time k' when additional measurements are made. Steps
1, 2 are then repeated to obtain a new sequence of
optimal controls u*(k'), ..., u*(T-1).

The name open-loop feedback optimal is used because essentially
an open-loop stochastic control problem is solved at each updating time
and then the optimal controls are applied in a feedback form.

Applying this philosophy to the coordination problem, we have
the following scheme. At each time k = 0, \ell, 2\ell, ..., T-\ell,

1. The statistics of the state of the system x(k) are
generated (possibly with a nonlinear filter) from
the available data Y_0(k) by the coordinator.

2. Assuming that no measurements will be made by the
coordinator in the future, the coordinator then faces a
problem similar to Problem 3.1 except that the system
starts from time k and the a priori statistics on
x(0) are replaced by the conditional density of x(k)
given Y_0(k).

3. The coordinating parameters p*(k), ..., p*(T-1) can be
found. They would define the lower level problems
from which the optimal control strategies \gamma_i^t, \eta_i^t,
t = k, ..., T-1, i = 1, ..., N are computed by the
local level. These optimal control strategies are
applied until t = k + \ell - 1 when a new set of data
Y_0(k+\ell) is available to the coordinator. The whole
process is then repeated.
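The three steps above can be sketched for a deliberately simple case. The example is not the coordination problem of the text: it assumes a single scalar linear system with quadratic cost, and it assumes that the state is observed exactly at each updating time and not at all in between. All names and numbers are illustrative. At each updating time an open-loop plan is computed for the rest of the horizon, and only its first \ell controls are applied before replanning.

```python
def scalar_riccati(a, b, q, r, f, T):
    """Scalar backward Riccati recursion; returns [K(0), ..., K(T)]."""
    K = [0.0] * (T + 1)
    K[T] = f
    for k in range(T - 1, -1, -1):
        Kn = K[k + 1]
        K[k] = q + a * a * Kn * (1.0 - b * b * Kn / (r + b * b * Kn))
    return K

def open_loop_plan(a, b, r, K, x_now, k0):
    """Controls for times k0, ..., T-1 assuming no further measurements."""
    T = len(K) - 1
    plan, x = [], x_now
    for k in range(k0, T):
        u = -b * K[k + 1] * a * x / (r + b * b * K[k + 1])
        plan.append(u)
        x = a * x + b * u            # predicted state only
    return plan

def olfo_run(a, b, q, r, f, T, ell, x0, disturbance):
    """Apply open-loop plans that are recomputed every ell steps."""
    K = scalar_riccati(a, b, q, r, f, T)
    x, xs, plan, t0 = x0, [x0], [], 0
    for k in range(T):
        if k % ell == 0:             # updating time: state assumed observed exactly
            t0, plan = k, open_loop_plan(a, b, r, K, x, k)
        x = a * x + b * plan[k - t0] + disturbance[k]
        xs.append(x)
    return xs
```

Without disturbances, replanning changes nothing, since the predicted and actual states coincide. With a disturbance between updating times, the periodically updated run departs from the single open-loop plan only after the next updating time.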
Using this approach, Problem 5.1 can be solved by the following
hierarchical scheme.

Lower Level:

    n\ell \leq k < n\ell + \ell,    n = 0, ..., m-1

    x_i(k+1) = f_i( x_i(k), v_i(k), u_i(k), \xi_i(k) )                              (5.3.1)

    u_i(k) = \gamma_i^k(n\ell)( Y_i(k), U_i(k-1), Y_0(n\ell); I )                   (5.3.2)

    v_i(k) = \eta_i^k(n\ell)( Y_0(n\ell); I )                                       (5.3.3)

    J_i( p(n\ell), Y_0(n\ell) ) = E\{ K_i(x_i(T)) + \sum_{k=n\ell}^{T-1} [ L_i( x_i(k), u_i(k) )
        + p_i'(k;n\ell) v_i(k) - \sum_{j \neq i} p_j'(k;n\ell) g_{ji}(x_i(k)) ] | Y_0(n\ell) \}    (5.3.4)

Find \gamma_i^k(n\ell) and \eta_i^k(n\ell), k \geq n\ell, i = 1, ..., N such that J_i( p(n\ell), Y_0(n\ell) ) is
minimized.

    p(n\ell) = \{ p(k;n\ell) \}_{k=n\ell}^{T-1}                                     (5.3.5)

Let

    J_i^*( p(n\ell), Y_0(n\ell) ) = Min J_i( p(n\ell), Y_0(n\ell) )                 (5.3.6)

Higher Level:

    Maximize_{p(n\ell)}  \sum_{i=1}^{N} J_i^*( p(n\ell), Y_0(n\ell) )               (5.3.7)

    p(n\ell) = \phi( Y_0(n\ell) )                                                   (5.3.8)

where \phi is some measurable function of Y_0(n\ell).

Apart from solving the maximization problem, the coordinator is also
responsible for generating the conditional density of x(n\ell) given Y_0(n\ell)
which is necessary for the definition of the lower level problems. In
addition to p^*(n\ell), this probability density also has to be transmitted. For the
lower level controllers, although the optimal \gamma_i^k(n\ell) and \eta_i^k(n\ell), k \geq n\ell,
are computed, only \gamma_i^k(n\ell) and \eta_i^k(n\ell) with n\ell \leq k < n\ell + \ell are used in the
actual control.

We notice that the open-loop problem has to be solved at each updating
time by the coordinator and then the stochastic control problem solved
by the lower level controllers. Depending on the nature of the lower and
higher level problems, this open-loop feedback optimal strategy may or may
not be feasible. When the updating interval is very long, then the
computations involved may become manageable. Again, analytical results can be
obtained for the linear-quadratic-Gaussian case. This is discussed in the
next section.

4. Open Loop Feedback Optimal Coordination of the Linear-Quadratic-Gaussian Problem

In Chapter 4, we obtained the optimal control strategies for the
linear-quadratic-Gaussian case when off-line coordination is assumed. This
will be used to obtain the open loop feedback optimal coordination of the
linear-quadratic-Gaussian problem.

In Section 4.3, equation (4.3.1) gives

    u_i^*(k) = - D_i(k+1) ( \hat{x}_i(k) - \bar{x}_i(k) ) - E_i(k+1) p_i^*(k)       (5.4.1)

where \bar{x}_i(k) is the estimate generated by the ith controller using his
a priori information and the coordinating parameters p^*(k). \hat{x}_i(k) is
generated using the measurements in addition. They are given by equations
(4.3.7) and (4.3.10).

Using the open loop feedback optimal philosophy, we have the
following.

At any time k, let n\ell be the last updating time. Thus

    n\ell \leq k < n\ell + \ell                                                     (5.4.2)

Then

    u_i^*(k) = - D_i(k+1) ( \hat{x}_i(k|k) - \bar{x}_i(k|n\ell) ) - E_i(k+1) p_i^*(k;n\ell)    (5.4.3)

where the gain matrices are the same as those given in Chapter 4.

    \hat{x}_i(k|k) = E\{ x_i(k) | Y_i(k), U_i(k-1); I_i, p^*(n\ell), \hat{x}_i(n\ell|n\ell), \Sigma_{ii}(n\ell|n\ell) \}    (5.4.4)

    \bar{x}_i(k|n\ell) = E\{ x_i(k) | I_i, p^*(n\ell), \hat{x}_i(n\ell|n\ell) \}    (5.4.5)

I_i is the decentralized a priori information of the ith controller
about his subsystem. \bar{x}_i(k|n\ell) is the estimate generated by the ith
controller based on the coordinating parameters p^*(n\ell) and \hat{x}_i(n\ell|n\ell).
\hat{x}_i(k|k) is the decentralized estimate generated by the ith controller
using the coordinating parameters p^*(n\ell), the statistics \hat{x}_i(n\ell|n\ell) and
\Sigma_{ii}(n\ell|n\ell), and the data Y_i(k), U_i(k-1).

The state estimate of the coordinator is generated as follows.

    \hat{x}(k+1|k+1) = A \hat{x}(k|k) + B u(k) + G(k+1) [ y(k+1) - C ( A \hat{x}(k|k) + B u(k) ) ]

    \hat{x}(0|0) = \hat{x}(0)                                                       (5.4.6)

    G(k+1) = \Sigma(k+1|k+1) C' \Theta^{-1}(k+1)                                    (5.4.7)

    \Sigma(k+1|k+1) = [ C' \Theta^{-1}(k+1) C + ( A \Sigma(k|k) A' + \Xi(k) )^{-1} ]^{-1}

    \Sigma(0|0) = \Sigma(0) - \Sigma(0) C' ( C \Sigma(0) C' + \Theta(0) )^{-1} C \Sigma(0)    (5.4.8)
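One step of the coordinator's filter can be sketched as follows. This is a minimal illustration under assumptions: the symbols Xi and Theta are taken here to be the covariances of the process and measurement noise, both assumed positive definite, and the function name and argument layout are hypothetical. The covariance is updated in the inverse (information) form of (5.4.8), the gain follows (5.4.7), and the estimate follows (5.4.6).

```python
import numpy as np

def coordinator_filter_step(x_hat, Sigma, u, y_next, A, B, C, Xi, Theta):
    """One filtering step in the form of (5.4.6)-(5.4.8).

    Xi and Theta are assumed process and measurement noise covariances.
    Returns the updated estimate, updated covariance and the gain.
    """
    P = A @ Sigma @ A.T + Xi                                   # one-step predicted covariance
    Theta_inv = np.linalg.inv(Theta)
    Sigma_next = np.linalg.inv(C.T @ Theta_inv @ C + np.linalg.inv(P))
    G = Sigma_next @ C.T @ Theta_inv                           # filter gain
    x_pred = A @ x_hat + B @ u                                 # predicted state
    x_next = x_pred + G @ (y_next - C @ x_pred)                # measurement correction
    return x_next, Sigma_next, G
```

The gain and covariance computed this way agree with the more common form in which the gain is the predicted covariance times C' times the inverse innovation covariance; the inverse form is used here only because it mirrors the equations above.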

The coordinating parameters p^*(k;n\ell) are given by

    p^*(k;n\ell) = - \lambda^*(k+1;n\ell) = - 2 K(k+1) \bar{x}(k+1|n\ell)           (5.4.9)

where

    \bar{x}(k+1|n\ell) = ( I - B T^{-1}(k+1) B' K(k+1) ) A \bar{x}(k|n\ell)         (5.4.10)

\bar{x}(n\ell|n\ell) = \hat{x}(n\ell|n\ell) is given by (5.4.6).

The decentralized state estimates \hat{x}_i(k|k) and \bar{x}_i(k|n\ell) are found as
follows.

    \hat{x}_i(k+1|k+1) = A_{ii} \hat{x}_i(k|k) + v_i^*(k|n\ell) + B_i u_i(k) + G_i(k+1|n\ell) [ y_i(k+1)
        - C_i ( A_{ii} \hat{x}_i(k|k) + v_i^*(k|n\ell) + B_i u_i(k) ) ]               (5.4.11)

    \hat{x}_i(n\ell|n\ell) = [ \hat{x}(n\ell|n\ell) ]_i                             (5.4.12)

    G_i(k+1|n\ell) = \Sigma_i(k+1|k+1;n\ell) C_i' \Theta_i^{-1}(k+1)                (5.4.13)

    \Sigma_i(k+1|k+1;n\ell) = [ C_i' \Theta_i^{-1}(k+1) C_i + ( A_{ii} \Sigma_i(k|k;n\ell) A_{ii}' + \Xi_i(k) )^{-1} ]^{-1}    (5.4.14)

    \Sigma_i(n\ell|n\ell;n\ell) = \Sigma_{ii}(n\ell|n\ell)                          (5.4.15)

    \bar{x}_i(k+1|n\ell) = - K_i^{-1}(k+1) [ r_i(k+1|n\ell) + \frac{1}{2} p_i^*(k;n\ell) ]    (5.4.16)

v_i^*(k|n\ell) and r_i(k|n\ell) are given by the following equations.

    v_i^*(n\ell|n\ell) = - A_{ii} \hat{x}_i(n\ell|n\ell) - K_i^{-1}(n\ell+1) r_i(n\ell+1|n\ell)
        - \frac{1}{2} S_i^{-1}(n\ell+1) p_i^*(n\ell;n\ell)                            (5.4.17)

    v_i^*(k|n\ell) = A_{ii} K_i^{-1}(k) ( r_i(k|n\ell) + \frac{1}{2} p_i^*(k-1;n\ell) )
        - K_i^{-1}(k+1) r_i(k+1|n\ell) - \frac{1}{2} S_i^{-1}(k+1) p_i^*(k;n\ell),

        n\ell < k < n\ell + \ell                                                    (5.4.18)

r^lnt\nt) ■ - nt) - y 

’ hli§^ <n£+l) I *^) (5.4.19) 

2 i K i " 1 (k)r i (k|n£) - - j £^* (k;n£) - j Aj^ i£i * (k;n£) 

n l < k < nt + l (5.4.20) 

    r_i(T|n\ell) = 0                                                                (5.4.21)

Essentially at each updating time n\ell, the coordinator evaluates a
state estimate \hat{x}(n\ell|n\ell) and new covariance \Sigma(n\ell|n\ell) using his measurements.
From these the optimal coordinating parameters p^*(k;n\ell) are computed.
\hat{x}_i(n\ell|n\ell), \Sigma_{ii}(n\ell|n\ell) and p^*(k;n\ell), k = n\ell, ..., n\ell+\ell-1, are transmitted
to the ith lower level controller.

The ith lower level controller generates the new a priori estimate
\bar{x}_i(k|n\ell) from the coordinating parameters as well as the estimate \hat{x}_i(k|k)
from his own data. From Theorem 4.5.2 we can see that \bar{x}_i(k|n\ell) is the same
as that generated by the coordinator using equation (5.4.10). The optimal
control strategy given by equation (5.4.1) consists of a part depending
on the difference between these two estimates and a part specified by the
coordinator through the coordinating parameters. When these two estimates
are the same, essentially the coordinator takes over the control of the
system. This can happen when the coordinator updates his information as
often as the lower level controllers. In this case from Appendix D,

    u_i^*(k) = - E_i(k+1) p_i^*(k;k)

             = - [ T^{-1}(k+1) B' K(k+1) A \hat{x}(k|k) ]_i                         (5.4.22)

Thus the control strategy approaches that given by a separation theorem
asymptotically.

Remark : 

The linear-quadratic-Gaussian problem is really a very special case.
We note that for open loop feedback optimal coordination, the lower level
controllers do not need the whole set of coordinating parameters from time
n\ell to T-1. In general, this may not be the case. Recall also that for
centralized information structure, there is no difference between the open
loop feedback optimal control and the closed loop optimal control for the
LQG problem. In the case of coordination, however, there will be a
difference, as can be seen from the next section. The fact that the
coordinator assumes the availability of future information introduces some
features which are not present in the open loop feedback optimal case.





5. Closed-Loop Optimal Periodic Coordination 

In closed-loop coordination, the coordinator chooses his
coordinating parameters knowing that measurements will be made in the 
future. This has both advantages and disadvantages. Instead of having 
a different set of coordinating parameters which are changed at each 
updating time, only one set of coordinating parameters will be required 
but the future ones depend on measurements still not yet available to the 
coordinator. Complete decoupling of the lower level problems is not ob- 
tained in this case, however. The following theorem uses the closed loop 
nature of the coordination to reduce Problem 5.1 to another problem. 

Theorem 5.5.1 : 

If closed-loop periodic coordination is used and the coordinator has
perfect memory, then Problem 5.1 is equivalent to the following problem.

Problem 5.2:
Given

    x_i(k+1) = f_i( x_i(k), v_i(k), u_i(k), \xi_i(k) ),    i = 1, ..., N            (5.5.1)

    J = \sum_{i=1}^{N} J_i                                                          (5.5.2)

    J_i = E\{ K_i(x_i(T)) + \sum_{k=0}^{T-1} L_i( x_i(k), u_i(k) ) \}               (5.5.3)

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \} = 0,
        k + \ell > t \geq k,    k = 0, \ell, ..., T-\ell,    i = 1, ..., N          (5.5.4)

    u_i(k) = \gamma_i^k( Y_i(k), U_i(k-1), Y_0(k); I )                              (5.5.5)

    v_i(k) = \eta_i^k( Y_0(k); I )                                                  (5.5.6)

Find optimal \gamma_i^k and \eta_i^k, i = 1, ..., N; k = 0, ..., T-1 such that J is
minimized.

Proof : 

We need only to show that constraint (5.2.4) is equivalent to constraint 
(5.5.4). 


Equation (5.2.4) obviously implies equation (5.5.4). 

Suppose

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \} = 0

for all k + \ell > t \geq k; k = 0, \ell, ..., T-\ell.

Then consider

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \}

where t \geq k + \ell.

Since measurements will be made in the future and closed loop coordination
is used, there exists an updating time k' such that k' + \ell > t \geq k'
and

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k') \} = 0                    (5.5.7)

Thus

    E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k) \}
        = E\{ E\{ v_i(t) - \sum_{j \neq i} g_{ij}(x_j(t)) | Y_0(k') \} | Y_0(k) \} = 0    (5.5.8)

The first equality follows from the perfect memory of the coordinator
and the nested property of the conditional expectation.

Q.E.D.

In Section 3.4 we considered a constrained stochastic control
problem and used it to obtain an off-line decomposition for coupled
dynamic systems. There the constraint is with respect to the unconditional
mean and hence the decomposition is off-line. To obtain an on-line
decomposition which is closed loop optimal, we have to consider constraints
which are conditioned on on-line measurements. Corresponding to Problem
3.3 we thus have the following problem.

Problem 5.3

    System:  x(k+1) = f( x(k), v(k), u(k), \xi(k) )                                 (5.5.9)

    E\{ H(x(t), v(t)) | Y_0(k) \} = 0,
        k + \ell > t \geq k,    k = 0, \ell, 2\ell, ...                             (5.5.10)

    Measurement:  y(k) = h( x(k), \theta(k) )                                       (5.5.11)

    Cost Functional:  J = E\{ K(x(T)) + \sum_{k=0}^{T-1} L( x(k), u(k) ) \}         (5.5.12)

\xi(k), \theta(k), k = 0, ..., T-1 and x(0) are random vectors with known
statistics.

    Y(k) = \{ y(0), ..., y(k); u(0), ..., u(k-1) \}                                 (5.5.13)

    Y_0(k\ell) = Y_0(k\ell+1) = ... = Y_0(k\ell+\ell-1) = Y(k\ell),
        k = 0, ..., m-1                                                             (5.5.14)

    u(k) = \gamma^k( Y(k) )                                                         (5.5.15)

    v(k) = \eta^k( Y_0(k) )                                                         (5.5.16)

It is required to find \gamma^k and \eta^k such that J is minimized and the
constraint is satisfied.
Corresponding to Theorem 3.4.1 we have the following.

Theorem 5.5.2:

Let VSP_{a;b}\{ L(a,b) \} denote the value of the saddle-point of L where a is the
minimizing variable and b is the maximizing variable. If the saddle point
exists in the following functional equation, then the optimal cost for
Problem 5.3 is given by

    E\{ V(Y_0(0), 0) \}                                                             (5.5.17)

where V(Y_0(k), k) satisfies the functional equation

    V(Y_0(k), k) = VSP_{ \gamma^k, ..., \gamma^{k+\ell-1}; v(k), ..., v(k+\ell-1); p(k), ..., p(k+\ell-1) }
        E\{ \sum_{t=k}^{k+\ell-1} [ L(x(t), u(t)) + p'(t) H(x(t), v(t)) ] + V(Y_0(k+\ell), k+\ell) | Y_0(k) \},
        k = 0, \ell, ..., T-\ell                                                    (5.5.18)

    V(Y_0(T), T) = E\{ K(x(T)) | Y_0(T) \}                                          (5.5.19)

Proof:

Define

    V(Y_0(k), k) = Min_{ \gamma^k, ..., \gamma^{T-1}; v(t) \in \Omega_1(Y_0(t),t), t = k, ..., k+\ell-1; \eta^t \in \Omega_2(Y_0(t),t), t = k+\ell, ..., T-1 }
        E\{ \sum_{t=k}^{T-1} L(x(t), u(t)) + K(x(T)) | Y_0(k) \}

    = Min_{ \gamma^k, ..., \gamma^{k+\ell-1}; v(t) \in \Omega_1(Y_0(t),t), t = k, ..., k+\ell-1 }
        [ E\{ \sum_{t=k}^{k+\ell-1} L(x(t), u(t)) | Y_0(k) \}
        + Min_{ \gamma^{k+\ell}, ..., \gamma^{T-1}; \eta^t \in \Omega_2(Y_0(t),t), t = k+\ell, ..., T-1 }
          E\{ \sum_{t=k+\ell}^{T-1} L(x(t), u(t)) + K(x(T)) | Y_0(k) \} ]            (5.5.20)

    \Omega_1(Y_0(t), t) = \{ v | E\{ H(x(t), v) | Y_0(t) \} = 0 \}                  (5.5.21)

    \Omega_2(Y_0(t), t) = \{ \eta | E\{ H(x(t), \eta(Y_0(t))) | Y_0(t) \} = 0 \}    (5.5.22)

From Appendix A, the second term on the right hand side of (5.5.20) to be
minimized is E\{ V(Y_0(k+\ell), k+\ell) | Y_0(k) \}. Thus

    V(Y_0(k), k) = Min_{ \gamma^k, ..., \gamma^{k+\ell-1}; v(t) \in \Omega_1(Y_0(t),t), t = k, ..., k+\ell-1 }
        E\{ \sum_{t=k}^{k+\ell-1} L(x(t), u(t)) + V(Y_0(k+\ell), k+\ell) | Y_0(k) \}    (5.5.23)

Form the Lagrangian for equation (5.5.23) as

    E\{ \sum_{t=k}^{k+\ell-1} [ L(x(t), u(t)) + p'(t) H(x(t), v(t)) ] + V(Y_0(k+\ell), k+\ell) | Y_0(k) \}    (5.5.24)

If a saddle point exists, then V(Y_0(k), k) is given by equation
(5.5.18). By the same argument as in Theorem 4.3.1,

    V(Y_0(0), 0) = Min_{ \gamma^k; \eta^k \in \Omega_2(Y_0(k),k), k = 0, ..., T-1 }
        E\{ \sum_{k=0}^{T-1} L(x(k), u(k)) + K(x(T)) | Y_0(0) \}                    (5.5.25)

It is thus the optimal cost for Problem 5.3. Q.E.D.

By the property of the saddle point, the following corollary is
obvious.
Corollary 5.5.3:

If the optimal Lagrange multipliers p^*(k) = \phi(Y_0(k)) are known, then

    V(Y_0(k), k) = Min_{ \gamma^k, ..., \gamma^{k+\ell-1}; v(k), ..., v(k+\ell-1) }
        E\{ \sum_{t=k}^{k+\ell-1} [ L(x(t), u(t)) + p^{*'}(t) H(x(t), v(t)) ]
        + V(Y_0(k+\ell), k+\ell) | Y_0(k) \},    k = 0, \ell, ..., T-\ell           (5.5.26)

    V(Y_0(T), T) = E\{ K(x(T)) | Y_0(T) \}                                          (5.5.27)

Remark : 

It is also possible to substitute the VSP operation in Theorem 5.5.2 with 
maxmin or minmax. 

If we apply the corollary to Problem 5.2, we will find that the
terms involving L(x(t), u(t)) and p^{*'}(t) H(x(t), v(t)) are separable into
the subsystems. However, V(Y_0(k+\ell), k+\ell) cannot be separated into a sum of
N independent parts. Thus, although we have an optimal stochastic control
problem consisting of N uncoupled subsystems, the cost functional, which
is essentially separable, has a terminal term which is not separable. This
makes our problem of finding the optimal controls considerably harder than
the open loop feedback optimal case. The main difference between the two
cases lies in the assumption of future measurements by the coordinator.

When future measurements are assumed to be made, this fact will be made
use of by the lower level controllers as well as by the coordinator.

Although (5.5.26) is not as simple as the lower level problem in
Section 3, it is simpler than the original problem with decentralized a
posteriori information and common a priori information. Moreover, only
one sequence of p's needs to be chosen by the coordinator. In the open
loop feedback optimal case, a different sequence of p's has to be
chosen at each updating time. This equation is quite nontrivial to
solve. In the next two sections, we shall show how this equation can
be solved for the linear-quadratic-Gaussian case.

6. A Special Linear Dynamic Team Problem

In this section we shall consider a special linear dynamic
team. The solution of this team problem will be used in the next
section to obtain the closed loop optimal coordination of the linear-
quadratic-Gaussian problem. Problem 5.4 stated below is the team
decision problem under consideration. We shall relate its solution
to the solutions of Problems 5.5 and 5.6 which are simpler.

Problem 5.4:

    x_i(k+1) = A_{ii} x_i(k) + v_i(k) + B_i u_i(k) + \xi_i(k)                       (5.6.1)

    J = E\{ \sum_{i=1}^{N} \sum_{k=0}^{T-1} [ x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k)
        - \tilde{p}_i'(k) x_i(k) ] + x'(T) K(T) x(T) \}                              (5.6.2)

    u_i(k) = \gamma_i^k( Y_i(k), U_i(k-1) )                                         (5.6.3)

v_i(k) depends only on the a priori information.

x_i(0) given, i = 1, ..., N.

Find \gamma_i, v_i, i = 1, ..., N such that J is minimized.

Remark: This is a special case of a dynamic team. The dynamics of
the subsystems are uncoupled. The cost functional is essentially
uncoupled except for the terminal cost.

Problem 5.5:  i = 1, ..., N

    x_i(k+1) = A_{ii} x_i(k) + v_i(k) + B_i u_i(k) + \xi_i(k)                       (5.6.4)

    J_i = E\{ \sum_{k=0}^{T-1} [ x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k)
        - \tilde{p}_i'(k) x_i(k) ] + x_i'(T) K_{ii}(T) x_i(T) + 2 r_i'(T) x_i(T) \}  (5.6.5)

    u_i(k) = \gamma_i^k( Y_i(k), U_i(k-1) )                                         (5.6.6)

v_i(k) depends only on the a priori information.
x_i(0) given.

Find \gamma_i^*, v_i^* such that J_i is minimized, i = 1, ..., N.

r_i(T) is an n_i dimensional vector.

Remark: Given r_i(T), we have N uncoupled stochastic control problems
with linear terms in the terminal costs.

Problem 5.6: Same as Problem 5.5.

Find \gamma_i^*, v_i^* and r_i^*(T) such that J_i given r_i^* is minimized, i = 1, ..., N
and

    r_i^*(T) = \sum_{j \neq i} K_{ij}(T) \bar{x}_j^*(T) = \sum_{j \neq i} K_{ij}(T) E\{ x_j^*(T) \}    (5.6.7)

where x^*(T) is the resultant optimal trajectory.

Remark: The solution of Problem 5.6 gives the person-by-person optimal
(pbpo) strategy of the team. Given a team with payoff function
J(\gamma_1, ..., \gamma_N), the pbpo strategy (\gamma_1^*, ..., \gamma_N^*) is defined by

    J(\gamma_1^*, ..., \gamma_N^*) \leq J(\gamma_1^*, ..., \gamma_{i-1}^*, \gamma_i, \gamma_{i+1}^*, ..., \gamma_N^*)    (5.6.8)

for all i and \gamma_i.

Theorem 5.6.1: \gamma_i^*, v_i^*, i = 1, ..., N solve Problem 5.4 if and only if
they solve Problem 5.6.

Proof:

Necessity:

Suppose \gamma_i^*, v_i^*, i = 1, ..., N solve Problem 5.4. Then in particular

    E\{ \sum_{i=1}^{N} \sum_{k=0}^{T-1} [ x_i^{*'}(k) Q_i x_i^*(k) + u_i^{*'}(k) R_i u_i^*(k) + p_i'(k) v_i^*(k)
        - \tilde{p}_i'(k) x_i^*(k) ] + x^{*'}(T) K(T) x^*(T) \}

    \leq E\{ \sum_{k=0}^{T-1} [ x_i'(k) Q_i x_i(k) + u_i'(k) R_i u_i(k) + p_i'(k) v_i(k)
        - \tilde{p}_i'(k) x_i(k) ] + 2 x_i'(T) \sum_{j \neq i} K_{ij}(T) x_j^*(T)
        + x_i'(T) K_{ii}(T) x_i(T) \}

    + E\{ \sum_{j \neq i} \sum_{k=0}^{T-1} [ x_j^{*'}(k) Q_j x_j^*(k) + u_j^{*'}(k) R_j u_j^*(k)
        + p_j'(k) v_j^*(k) - \tilde{p}_j'(k) x_j^*(k) ]
        + terms independent of x_i(T) \}                                            (5.6.9)

Thus, by defining r_i^*(T) = \sum_{j \neq i} K_{ij}(T) \bar{x}_j^*(T) and subtracting
terms independent of the ith subsystem from each side of equation
(5.6.9) we see that Problem 5.6 is solved.

Sufficiency: Using the results in [H3] we can reduce Problem 5.4 to
a static linear team with a quadratic payoff. Reference [R1] then
shows that the person-by-person optimal strategy is also the optimal
strategy for the entire team.

Q.E.D.
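The sufficiency argument rests on the fact that, for a quadratic payoff, a person-by-person optimal strategy is optimal for the whole team. A deterministic analogue can be sketched numerically. This is only an illustration under assumptions: the payoff is a static quadratic form in the decision vector with an assumed positive definite matrix M and linear term c, the decision makers are blocks of coordinates, and the function name is hypothetical. Each decision maker in turn minimizes the payoff over his own block with the others held fixed.

```python
import numpy as np

def person_by_person(M, c, blocks, sweeps=200):
    """Minimize J(u) = u'Mu + 2c'u one block at a time, the others held fixed."""
    u = np.zeros(len(c))
    for _ in range(sweeps):
        for idx in blocks:
            others = [j for j in range(len(c)) if j not in idx]
            # Stationarity in the block: M_ii u_i + M_io u_o + c_i = 0.
            rhs = -(c[idx] + M[np.ix_(idx, others)] @ u[others])
            u[idx] = np.linalg.solve(M[np.ix_(idx, idx)], rhs)
    return u

# Illustrative data (assumed): two decision makers, the first with two variables.
M = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.5, 0.2],
              [0.3, 0.2, 1.0]])
c = np.array([1.0, -2.0, 0.5])
u_pbp = person_by_person(M, c, blocks=[[0, 1], [2]])
u_team = np.linalg.solve(M, -c)      # joint minimizer of the quadratic payoff
```

For a positive definite M the fixed point of these alternating minimizations coincides with the joint minimizer, which is the static counterpart of the statement used in the proof.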
We shall now solve Problem 5.5.

Theorem 5.6.2: The solution to Problem 5.5 is given by

    u_i^*(k) = - D_i(k+1) ( \hat{x}_i(k) - \bar{x}_i(k) ) - E_i(k+1) p_i(k),    k = 0, ..., T-1    (5.6.10)

    v_i^*(0) = - A_{ii} x_i(0) - K_i^{-1}(1) r_i(1) - \frac{1}{2} S_i^{-1}(1) p_i(0)    (5.6.11)

    v_i^*(k) = A_{ii} K_i^{-1}(k) [ r_i(k) + \frac{1}{2} p_i(k-1) ] - K_i^{-1}(k+1) r_i(k+1)
        - \frac{1}{2} S_i^{-1}(k+1) p_i(k),    k = 1, ..., T-1                      (5.6.12)

where D_i, E_i, \hat{x}_i(k), \bar{x}_i(k) are the same as those in Theorem 4.4.2 with
K_i(T) = K_{ii}(T) and r_i(k) is given by


£i<°> ' Ei(0) - l iii'EitO) - A ii , S i (l)a ii £ 1 <0) (5.6.13) 

£*»> - 5 A.i'Ei®) + 5 6 ii ’S i (k + l)A ii K i - 1 (W Ei (k + i) 


k=l , . . . ,T-1 

r^ (T) given 

Moreover, the optimal cost is 


(5.6.14) 


$$\bar{x}_i'(0)K_i(0)\bar{x}_i(0) + 2r_i'(0)\bar{x}_i(0) + s_i(0) \tag{5.6.15}$$

where $s_i(k)$ is given by equation (4.4.19).

Proof: Since Problem 5.5 and the lower level problem in Chapter 4 (Problem 4.2) differ only in the terminal cost, the functional equation (4.4.22) can also be used here to find the optimal control strategies. The terminal condition, however, has to be modified, with $r_i(T) \neq 0$.

Q.E.D.
Problem 5.4 can now be solved with the help of Theorem 5.6.1 and Theorem 5.6.2.

Theorem 5.6.3: The control strategies in Theorem 5.6.2 solve Problem 5.4, with

$$r(T) = -\tfrac{1}{2}p(T-1) + \tfrac{1}{2}\bar{K}(T)K^{-1}(T)\,p(T-1) \tag{5.6.16}$$

Proof: From equation (4.4.37),

$$\bar{x}_i(T) = -K_{ii}^{-1}(T)\big[r_i(T) + \tfrac{1}{2}p_i(T-1)\big] \tag{5.6.17}$$

If

$$r_i(T) = \sum_{j \neq i}K_{ij}(T)\,\bar{x}_j(T) \tag{5.6.18}$$

then Problem 5.6 and Problem 5.4 have the same solution. Define

$$\bar{K}(T) = \begin{bmatrix} K_{11}(T) & & 0 \\ & \ddots & \\ 0 & & K_{NN}(T) \end{bmatrix}$$

Then equations (5.6.17) and (5.6.18) give

$$r(T) = -\big(K(T) - \bar{K}(T)\big)\bar{K}^{-1}(T)\big[r(T) + \tfrac{1}{2}p(T-1)\big] \tag{5.6.19}$$

or

$$r(T) = -\tfrac{1}{2}p(T-1) + \tfrac{1}{2}\bar{K}(T)K^{-1}(T)\,p(T-1) \tag{5.6.20}$$

Q.E.D.
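The fixed-point argument in the proof of Theorem 5.6.3 can be checked numerically. The sketch below is only an illustration under stated assumptions: two scalar subsystems, an arbitrarily chosen positive definite $K(T)$ and an arbitrary $p(T-1)$ (neither taken from the thesis), and hypothetical helper names. It computes $r(T)$ from (5.6.16), the mean terminal states from (5.6.17), and confirms that the coupling relation (5.6.18) reproduces the same $r(T)$.

```python
from fractions import Fraction as F

def matvec(M, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [M[i][0] * v[0] + M[i][1] * v[1] for i in range(2)]

def inv2(M):
    """Invert a 2x2 matrix with exact rational arithmetic."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def terminal_coordination(K, p):
    """Return (r, xbar, r_from_coupling) for two scalar subsystems."""
    # Block-diagonal part of K(T); with scalar subsystems it is diag(K_11, K_22).
    Kbar = [[K[0][0], F(0)], [F(0), K[1][1]]]
    Kbar_Kinv_p = matvec(Kbar, matvec(inv2(K), p))
    # (5.6.16): r(T) = -1/2 p(T-1) + 1/2 Kbar(T) K^{-1}(T) p(T-1)
    r = [-p[i] / 2 + Kbar_Kinv_p[i] / 2 for i in range(2)]
    # (5.6.17): xbar_i(T) = -K_ii^{-1}(T) [r_i(T) + 1/2 p_i(T-1)]
    xbar = [-(r[i] + p[i] / 2) / K[i][i] for i in range(2)]
    # (5.6.18): r_i(T) = sum over j != i of K_ij(T) xbar_j(T)
    r_from_coupling = [K[0][1] * xbar[1], K[1][0] * xbar[0]]
    return r, xbar, r_from_coupling

K = [[F(2), F(1)], [F(1), F(3)]]   # illustrative K(T), assumed positive definite
p = [F(4), F(-2)]                  # illustrative p(T-1)
r, xbar, r_check = terminal_coordination(K, p)
print(r == r_check)
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point tolerance check.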


The special dynamic team given by Problem 5.4 has been considered in detail. If the $r_i(T)$'s are chosen appropriately as given by equation (5.6.16), then the optimal control strategies can be found in a decentralized manner by finding the solution to Problem 5.5. We may regard the $r_i(T)$'s as the coordinating parameters for decomposing the terminal cost.
7. Closed Loop Optimal Coordination of the Linear-Quadratic-Gaussian Problem

In this section we shall look at the closed loop coordination of 
the linear-quadratic-Gaussian case. Using the results of the previous 
section, we shall show that although the functional equation (5.5.38) is 
not separable into the individual subsystems, we can, by introducing an 
extra coordinating parameter, define optimization problems for each of 
the lower level controllers. By solving these problems, it is found
that closed loop optimal coordination and open loop feedback optimal 
coordination give rise to essentially the same control strategies for 
the lower level controllers. However, the gain matrices in the control 
strategies are different. 

Substituting in the right functions, equation (5.5.18) becomes the following:

$$
\begin{aligned}
V(Y_0(k),k) = \max_{p(k),\ldots,p(k+\ell-1)}\ \min_{\substack{\gamma^k,\ldots,\gamma^{k+\ell-1} \\ v(k),\ldots,v(k+\ell-1)}} E\Big\{&\sum_{i=1}^{N}\sum_{t=k}^{k+\ell-1}\big[x_i'(t)Q_i x_i(t) + u_i'(t)R_i u_i(t) + p_i'(t)v_i(t) - p_i'(t)x_i(t)\big] \\
&+ V(Y_0(k+\ell),k+\ell)\ \Big|\ Y_0(k)\Big\}, \quad k = 0, \ell, \ldots, T-\ell
\end{aligned} \tag{5.7.1}
$$

$$V(Y_0(T),T) = E\{x'(T)\,F\,x(T) \mid Y_0(T)\} \tag{5.7.2}$$



We shall show that at each updating time $k = 0, \ell, \ldots, T-\ell$, the optimal cost to go as evaluated by the coordinator is given by

$$V(Y_0(k),k) = \hat{x}'(k|k)K(k)\hat{x}(k|k) + b(k) \tag{5.7.3}$$

where $\hat{x}(k|k)$ is the estimate of the state of the system by the coordinator given by equation (5.4.6), $K(k)$ is the solution to the Riccati equation for the entire system, and $b(k)$ is a precomputable scalar.

This will be proved by induction. First, we shall need the following lemma.

Lemma 5.7.1: Let $\hat{x}(k|k)$ and $\Sigma(k|k)$ be as defined in Section 4. Then for any positive definite matrix $M$, and $k > m$,

$$E\{\hat{x}'(k|k)M\hat{x}(k|k) \mid Y_0(m)\} = E\{x'(k)Mx(k) \mid Y_0(m)\} - \operatorname{tr} M\Sigma(k|k) \tag{5.7.4}$$

Proof:

$$E\{\hat{x}'(k|k)M\hat{x}(k|k) \mid Y_0(m)\} = E\big\{E\{\hat{x}'(k|k)M\hat{x}(k|k) \mid Y_0(k)\} \mid Y_0(m)\big\} \tag{5.7.5}$$

But

$$E\{\hat{x}'(k|k)M\hat{x}(k|k) \mid Y_0(k)\} = E\{x'(k)Mx(k) \mid Y_0(k)\} - \operatorname{tr} M\Sigma(k|k) \tag{5.7.6}$$

Since $\Sigma(k|k)$ is independent of the measurements, substituting (5.7.6) into (5.7.5) gives (5.7.4).
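The identity in Lemma 5.7.1 can be illustrated in the simplest possible setting. The sketch below assumes a scalar state with prior variance `p`, a single measurement $y = x + n$ with noise variance `r`, and conditioning on no earlier data, so the outer expectation is unconditional. These modelling choices and the function name are assumptions made only for the example; they are not the thesis's general setting.

```python
from fractions import Fraction

def lemma_sides(p, r, m):
    """Both sides of (5.7.4) for a scalar Gaussian state and one measurement.

    Prior: x ~ N(0, p). Measurement: y = x + n with n ~ N(0, r), independent of x.
    The estimate is x_hat = gain * y with gain = p / (p + r).
    """
    gain = p / (p + r)
    sigma = p * r / (p + r)          # error variance Sigma(k|k), data independent
    # var(y) = p + r, hence E[x_hat^2] = gain^2 * (p + r)
    lhs = m * gain ** 2 * (p + r)    # E{x_hat' M x_hat}
    rhs = m * p - m * sigma          # E{x' M x} - tr(M Sigma)
    return lhs, rhs

lhs, rhs = lemma_sides(Fraction(2), Fraction(3), Fraction(5))
print(lhs == rhs)
```

The error variance enters only through `sigma`, which does not depend on the measured value; this is the property the proof relies on.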



The following theorem is related to the maximization problem of the coordinator.

Theorem 5.7.2: Let $J^*(p)$ denote the optimal cost for Problem 5.4. Then

(1)

$$\max_{p} J^*(p) = \bar{x}'(0)K(0)\bar{x}(0) + c(0) \tag{5.7.8}$$

where

$$c(0) = \sum_{i=1}^{N} c_i(0)$$

$$c_i(0) = \operatorname{tr} K_i(0)\Sigma_i(0|0) + \sum_{k=0}^{T-1}\Big[\operatorname{tr}\big(Q_i + A_{ii}'K_i(k+1)A_{ii} - K_i(k)\big)\Sigma_i(k|k) + \operatorname{tr} K_i(k+1)\Xi_i(k)\Big] \tag{5.7.9}$$

(2) the optimal $p^*(k)$ is equal to $-\lambda^*(k+1)$, which corresponds to the costate of the optimal deterministic control problem for the entire system;

(3)

$$u_i^*(k) = -D_i(k+1)\big(\hat{x}_i(k) - \bar{x}_i(k)\big) - \big[T^{-1}(k+1)B'K(k+1)A\,\bar{x}(k)\big]_i \tag{5.7.10}$$

Proof: From Theorems 5.6.2 and 5.6.3, the optimal cost is given by

$$
\begin{aligned}
J^*(p) &= \sum_{i=1}^{N}\big[\bar{x}_i'(0)K_i(0)\bar{x}_i(0) + 2r_i'(0)\bar{x}_i(0) + s_i(0) - r_i'(T)\bar{x}_i(T)\big] \\
&= \bar{x}'(0)\bar{K}(0)\bar{x}(0) + 2r'(0)\bar{x}(0) + \sum_{i=1}^{N}s_i(0) - r'(T)\bar{x}(T)
\end{aligned} \tag{5.7.11}
$$

where $s_i(0)$ is given by equation (4.5.6), and $r(T)$ is given by (5.6.16).

If we retain the terms involving $p$, we have the following constrained maximization problem ($\lambda(k)$ is as defined in Chapter 4).

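$$
\begin{aligned}
\max\ \ &\lambda'(1)A\bar{x}(0) + \sum_{k=1}^{T-1}\Big[-r'(k)\bar{K}^{-1}(k)Q\bar{K}^{-1}(k)r(k) + \lambda'(k)\bar{K}^{-1}(k)Q\bar{K}^{-1}(k)r(k) \\
&\quad - \tfrac{1}{4}\lambda'(k)\big[\bar{K}^{-1}(k)Q\bar{K}^{-1}(k) + BR^{-1}B'\big]\lambda(k)\Big] \\
&\quad - \tfrac{1}{4}\lambda'(T)\big[S^{-1}(T) - \bar{K}^{-1}(T) + K^{-1}(T)\big]\lambda(T)
\end{aligned} \tag{5.7.12}
$$

with respect to $r(k)$, $k = 1, \ldots, T-1$, and $\lambda(k)$, $k = 1, \ldots, T$, such that

$$Q\bar{K}^{-1}(k)\,r(k) = \tfrac{1}{2}A'\lambda(k+1) - \tfrac{1}{2}\big(I - Q\bar{K}^{-1}(k)\big)\lambda(k), \quad k = 1, \ldots, T-1 \tag{5.7.13}$$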


Carrying out the maximization using Lagrange multipliers, we obtain the same equations as in Theorem 4.5.1 except equation (4.5.15), which now becomes

$$\tfrac{1}{2}A\,a^*(T-1) - \tfrac{1}{2}\big(S^{-1}(T) - \bar{K}^{-1}(T) + K^{-1}(T)\big)\lambda^*(T) = 0 \tag{5.7.14}$$

Since

$$S^{-1}(T) = \bar{K}^{-1}(T) + BR^{-1}B' \tag{5.7.15}$$

we have

$$A\,a^*(T-1) - \big(BR^{-1}B' + K^{-1}(T)\big)\lambda^*(T) = 0$$

This, together with $\lambda^*(k)$, $k = 1, \ldots, T-1$, are the costates corresponding to the optimal deterministic control problem for the entire system with initial state $\bar{x}(0)$.

It can be verified (as in Section 4.5) that $\bar{x}_i(k)$ given in equation (5.6.10) is the unconditional (a priori) estimate by the coordinator. Equation (5.6.10) then becomes

$$u_i^*(k) = -D_i(k+1)\big(\hat{x}_i(k) - \bar{x}_i(k)\big) - \big[T^{-1}(k+1)B'K(k+1)A\,\bar{x}(k)\big]_i \tag{5.7.17}$$

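$J^*(p)$ is given by equation (5.7.11). Substituting in the optimal values for $p(0)$, we obtain

$$
\begin{aligned}
r(0) &= -\tfrac{1}{2}A'p(0) - \big(I - Q\bar{K}^{-1}(0)\big)\bar{K}(0)\bar{x}(0) \\
&= \big(K(0) - \bar{K}(0)\big)\bar{x}(0)
\end{aligned} \tag{5.7.18}
$$

From Appendix C,

$$\sum_{i=1}^{N}s_i(0) + r'(T)\bar{K}^{-1}(T)\big[r(T) - \tfrac{1}{2}\lambda^*(T)\big] = c(0) + \bar{x}'(0)\bar{K}(0)\bar{x}(0) - \bar{x}'(0)K(0)\bar{x}(0) \tag{5.7.19}$$

Therefore

$$
\begin{aligned}
J^*(p^*) = \max_{p} J^*(p) &= \bar{x}'(0)\big[\bar{K}(0) + 2K(0) - 2\bar{K}(0) + \bar{K}(0) - K(0)\big]\bar{x}(0) + c(0) \\
&= \bar{x}'(0)K(0)\bar{x}(0) + c(0)
\end{aligned} \tag{5.7.20}
$$

Q.E.D.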
With Lemma 5.7.1 and Theorem 5.7.2, equation (5.7.1) can now be solved.

Theorem 5.7.3: The solution to equation (5.7.1) is given by

$$V(Y_0(k),k) = \hat{x}'(k|k)K(k)\hat{x}(k|k) + b(k), \quad k = 0, \ell, \ldots, T-\ell \tag{5.7.21}$$

where $\hat{x}(k|k)$ is the estimate of the state of the system by the coordinator given by equation (5.4.7), $K(k)$ is the solution to the Riccati equation for the entire system, and

$$b(k) = b(k+\ell) - \operatorname{tr} K(k+\ell)\Sigma(k+\ell|k+\ell) + c(k) \tag{5.7.22}$$

$$
\begin{aligned}
c(k) = \sum_{i=1}^{N}\Big\{&\operatorname{tr} K_i(k;k+\ell)\Sigma_i(k|k) + \sum_{t=k}^{k+\ell-1}\Big[\operatorname{tr}\big(Q_i + A_{ii}'K_i(t+1;k+\ell)A_{ii} - K_i(t;k+\ell)\big)\Sigma_i(t|t) \\
&+ \operatorname{tr} K_i(t+1;k+\ell)\Xi_i(t)\Big]\Big\}
\end{aligned} \tag{5.7.23}
$$

$K_i(t;k+\ell)$ is given by

$$
\begin{aligned}
K_i(t;k+\ell) &= Q_i + A_{ii}'K_i(t+1;k+\ell)\big[I - B_i T_i^{-1}(t+1;k+\ell)B_i'K_i(t+1;k+\ell)\big]A_{ii} \\
K_i(k+\ell;k+\ell) &= K_{ii}(k+\ell)
\end{aligned} \tag{5.7.24}
$$

$$T_i(t+1;k+\ell) = R_i + B_i'K_i(t+1;k+\ell)B_i \tag{5.7.25}$$

$$b(T) = \operatorname{tr} F\,\Sigma(T|T) \tag{5.7.26}$$

Proof: To solve for $V(Y_0(T-\ell),T-\ell)$ we use Theorem 5.7.2, using the statistics generated from $Y_0(T-\ell)$ as the a priori information. From equation (5.7.8) we thus obtain

$$V(Y_0(T-\ell),T-\ell) = \hat{x}'(T-\ell|T-\ell)K(T-\ell)\hat{x}(T-\ell|T-\ell) + b(T-\ell) \tag{5.7.27}$$

Suppose

$$V(Y_0(k+\ell),k+\ell) = \hat{x}'(k+\ell|k+\ell)K(k+\ell)\hat{x}(k+\ell|k+\ell) + b(k+\ell) \tag{5.7.28}$$

Then, using Lemma 5.7.1, equation (5.7.1) becomes

$$
\begin{aligned}
V(Y_0(k),k) = \max_{p(k),\ldots,p(k+\ell-1)}\ \min_{\substack{\gamma^k,\ldots,\gamma^{k+\ell-1} \\ v(k),\ldots,v(k+\ell-1)}} E\Big\{&\sum_{i=1}^{N}\sum_{t=k}^{k+\ell-1}\big[x_i'(t)Q_i x_i(t) + u_i'(t)R_i u_i(t) + p_i'(t)v_i(t) - p_i'(t)x_i(t)\big] \\
&+ x'(k+\ell)K(k+\ell)x(k+\ell)\ \Big|\ Y_0(k)\Big\} \\
&+ b(k+\ell) - \operatorname{tr} K(k+\ell)\Sigma(k+\ell|k+\ell)
\end{aligned} \tag{5.7.29}
$$

From Theorem 5.7.2, with the common a priori statistics generated by $Y_0(k)$, we thus have equations (5.7.21) to (5.7.25).

Q.E.D.

We have demonstrated that at each updating time $k$, the lower level controllers need only to solve a stochastic control problem up to the next updating time $k+\ell$. However, this problem is not uncoupled into N sub-problems even with the coordinating parameters $p$ supplied by the coordinator. To uncouple the sub-problems, the coordinator has to send out an extra signal $r_i(k+\ell|k)$ given by

$$r_i(k+\ell|k) = \sum_{j \neq i}K_{ij}(k+\ell)\,\bar{x}_j(k+\ell|k) \tag{5.7.30}$$

where

$$\bar{x}(k+\ell|k) = E\{x(k+\ell) \mid Y_0(k)\} \tag{5.7.31}$$

and can be generated from equation (5.4.10).
Specifically, at the updating time $k$, the $i$th controller faces an uncoupled problem with the following cost functional.

$$
\begin{aligned}
J_i(Y_0(k),k) = E\Big\{&\sum_{t=k}^{k+\ell-1}\big[x_i'(t)Q_i x_i(t) + u_i'(t)R_i u_i(t) + p_i^{*\prime}(t;k)v_i(t) - p_i^{*\prime}(t;k)x_i(t)\big] \\
&+ x_i'(k+\ell)K_{ii}(k+\ell)x_i(k+\ell) + 2r_i'(k+\ell|k)x_i(k+\ell)\ \Big|\ Y_0(k)\Big\}
\end{aligned} \tag{5.7.31}
$$

The coordinator generates $\hat{x}(k|k)$, $\Sigma(k|k)$, $r(k+\ell|k)$ and $p^*(t;k)$, $t = k, \ldots, k+\ell-1$, using equations (5.4.6), (5.7.30) and (5.4.9) and his data $Y_0(k)$. Moreover, $K_{ii}(k+\ell)$ is also required by the $i$th controller. These are transmitted to the lower level to define their uncoupled decision problems specified by equation (5.7.31).

Using the results of Section 6, the optimal controls for the lower level are found to be the following.

For $k \le t < k+\ell$,

$$u_i(t) = -D_i(t+1;k+\ell)\big(\hat{x}_i(t|t) - \bar{x}_i(t|k)\big) - E_i(t+1)\,p_i^*(t;k) \tag{5.7.32}$$

where $E_i(t+1)$ is the same as that given in Chapter 4, and

$$D_i(t+1;k+\ell) = T_i^{-1}(t+1;k+\ell)B_i'K_i(t+1;k+\ell)A_{ii} \tag{5.7.33}$$

where $K_i(t+1;k+\ell)$ and $T_i(t+1;k+\ell)$ are given by equations (5.7.24) and (5.7.25).

$\hat{x}_i(t|t)$ and $\bar{x}_i(t|k)$ have the same interpretation as in Section 4. $p_i^*(t;k)$ is given by equation (5.4.9). Comparing with equation (5.4.3) we see that equation (5.7.32) differs only in the gain matrix $D_i(t+1;k+\ell)$. The part depending only on the coordinating parameters remains the same. When $\hat{x}_i(t|t) - \bar{x}_i(t|k)$ is zero, then the coordinator takes over the control completely and it is the same as that given by a separation theorem (recall the result of Section 4). Thus asymptotically both open loop feedback optimal coordination and closed loop optimal coordination give the same results.
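The gain computation in (5.7.24), (5.7.25) and (5.7.33) and the two-part control law (5.7.32) can be sketched for a single scalar subsystem. Everything numeric below is an illustrative assumption, as are the function names; in particular the feedforward gain `e` stands in for $E_i(t+1)$, which is defined in Chapter 4 and is simply passed in here as a given number.

```python
from fractions import Fraction as F

def gains(q, r, a, b, k_terminal, steps):
    """Backward recursion over one updating interval of length `steps`.

    K[t] plays the role of K_i(t; k+l) and D[t] the role of D_i(t+1; k+l),
    with time measured from the start of the interval.
    """
    K = [None] * (steps + 1)
    D = [None] * steps
    K[steps] = k_terminal                      # K_i(k+l; k+l) = K_ii(k+l)
    for t in range(steps - 1, -1, -1):
        T = r + b * K[t + 1] * b               # (5.7.25)
        D[t] = b * K[t + 1] * a / T            # (5.7.33)
        K[t] = q + a * K[t + 1] * (1 - b * b * K[t + 1] / T) * a   # (5.7.24)
    return K, D

def local_control(d, e, x_local, x_coord, p):
    """Two-part control law (5.7.32): estimate-difference feedback plus a
    term driven by the coordinating parameter."""
    return -d * (x_local - x_coord) - e * p

K, D = gains(F(1), F(1), F(1), F(1), F(1), steps=2)
u = local_control(D[0], F(1, 2), F(2), F(1), F(4))
print(K[0], D[0], u)
```

When the local and coordinator estimates agree, the first term vanishes and only the coordinator-driven part remains, which is the situation described in the paragraph above.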

Summarizing, closed loop optimal coordination differs from 
open loop feedback optimal coordination in two respects. 

(1) The lower level problems are easier to solve since the 
time interval under consideration is shorter. 

(2) The coordinator has to take into consideration the fact 
that he will be gathering information in the future. This 
results in a more sophisticated task of coordination. 

Apart from the usual coordinating parameter $p$ which has
to be transmitted and the state estimates, the coordinator
has to give each lower level controller both $K_{ii}(k+\ell)$
and $r_i(k+\ell|k)$.
8. Discussion and Perspectives

In this chapter we have studied the on-line coordination of 
dynamic systems when the coordinator collects measurements from the 
lower level periodically. Two types of on-line coordination have 
been considered: open loop feedback optimal and closed loop optimal. 

Open loop feedback optimal coordination is conceptually simple and 
ignores the availability of future measurements to the coordinator. 
Essentially the coordinator and the lower level controllers have to 
solve an off-line coordination problem at each updating time. For 
the linear-quadratic-Gaussian case simple control strategies are 
obtained which have nice physical interpretations. The control
strategy of each local controller has two parts: a part which depends
on the difference between his local estimate and the coordinator's
estimate about the state of his subsystem, and a part which depends
on the information of the coordinator.

Closed loop coordination assumes the availability of future 
measurements to the coordinator. In general a functional equation 
has to be solved by the coordinator. Even with the optimal coordi- 
nating parameters, the lower level problems are not uncoupled between 
updating times. In the linear-quadratic-Gaussian case, these lower 
level problems can be decomposed by the introduction of additional 
coordinating parameters. The resulting optimal control strategies 
are very similar to those obtained in open loop feedback optimal 
coordination. In fact, only the gain matrices multiplying the 
difference of the state estimates are different. 



CHAPTER 6 


CONCLUSIONS AND SUGGESTIONS FOR FUTURE RESEARCH 

In this thesis we have investigated ways of controlling a 
large-scale system in a decentralized manner. The two important 
aspects of computation and information are considered simultaneously. 
It is found that for systems with a particular structure, control
strategies which utilize decentralized information can also be
obtained by decentralized computation.

In some problems, such as the static optimization problem
considered in Chapter 2, the systems have this nice structure and
control strategies which are computationally and informationally
decentralized can be obtained right away. In some other problems,
like those coupled dynamic systems that we considered in Chapters 3,
4, and 5, this nice structure is not inherent. However, by
considering another kind of optimality, the problem can be reformulated
to have that structure. It is then possible to identify two levels 
of information structure, one belonging to the lower level controllers 
and one belonging to the coordinator who sees that certain constraints 
are satisfied. The problem can be solved in a two-level scheme. 

The lower level problems are decomposed both informationally and com- 
putationally when optimal coordinating parameters are transmitted 
from the coordinator. The job of the coordinator is to find these 
optimal coordinating parameters. 

The appeal of this approach lies in the ease with which the 
optimal strategies can be found. A high dimensional stochastic 
control problem is reduced to a number of lower dimensional problems.
Even for nonlinear problems, if the solutions to the lower level
problems are known, the solution for the entire problem can be
constructed. By specializing to the linear-quadratic-Gaussian case,
we obtain control strategies which are physically intuitive.

We have also allowed the situation when the coordinator makes 
measurements on-line. The control strategies obtained in Chapter 5 
for open loop feedback optimal coordination and closed loop optimal 
coordination provide alternative solutions to dynamic team problems 
when some sharing of past information is allowed.

The whole area of research in the decentralized control of 
large-scale systems is still wide open. Some areas which arise im- 
mediately from this thesis are the following. In the static opti- 
mization of stochastic systems, more work should be done to relate 
the information available to the different decision agents and their 
subsystems such that decomposition is possible. Results for the 
existence of a decomposition and computational algorithms to arrive 
at the decomposition are also desirable. 

In dynamic systems, results are needed in comparing the kind 
of optimality defined in this thesis with the usual kind of stochastic 
optimality. Optimality seems to be related to how well the constraint 
is satisfied. Intuitively, we would expect a system to be more op- 
timal when the constraints are satisfied more exactly. It should be 
possible to define a degree of coordination based on this constraint 
such that optimality is related to the degree of coordination. 
In this thesis, the coordinator has complete a priori information
about the structure of the entire system. This leads to a
rather complicated decision task for the coordinator. An area for 
future research will be the aggregation of a priori information and 
its relation to performance. 



APPENDIX A 


SOME RESULTS IN PROBABILITY THEORY 

In this appendix we summarize some definitions and results in probability theory which have been used in this thesis. The probability space under consideration is denoted by $(X, \mathcal{B}, \mu)$. $F$ and $F_0$ are sub-$\sigma$-fields of $\mathcal{B}$.

Def. A.1: $F \vee F_0$ is the smallest $\sigma$-field generated by $A \cap B$, where $A \in F$ and $B \in F_0$.

Lemma A.1: For any random variable $z$, if $E\{z|F_0\}$ is measurable with respect to $G$, $G \subset F_0$, then $E\{z|F_0\} = E\{z|G\}$ a.e.

Proof: Given any random variable $z$ and a $\sigma$-subfield $G$, the conditional expectation $E\{z|G\}$ is characterized by two conditions:

(a) It is measurable with respect to $G$;

(b) $$\int_A E\{z|G\}\,d\mu = \int_A z\,d\mu \quad \text{for every } A \in G \tag{A.1}$$

$E\{z|F_0\}$ is measurable with respect to $G$. Moreover,

$$\int_B E\{z|F_0\}\,d\mu = \int_B z\,d\mu \quad \text{for every } B \in F_0 \tag{A.2}$$

Since $G \subset F_0$, (A.2) is also true for every $B \in G$. Thus $E\{z|F_0\}$ satisfies equation (A.1), and $E\{z|F_0\} = E\{z|G\}$ a.e.

Q.E.D.
Lemma A.2: Let $\gamma$ be an $F \vee F_0$-measurable function from $X$ into $U$. Let $f$ be a measurable real-valued function on $U \times X$. Then given any $y \in X$, there exists a function $\gamma(\cdot\,;y)$ measurable with respect to $F$ such that

$$E\{f(\gamma(x),x) \mid F_0\}(y) = E\{f(\gamma(x;y),x) \mid F_0\}(y) \quad \text{a.e.} \tag{A.3}$$

Proof: We assume two conditions, which, for this thesis, will be satisfied.

(1) There exists a regular conditional probability measure $P_y^0(A)$.

(2) $F$ and $F_0$ are fields generated by functions $h$ and $h_0$ so that $\gamma$ being $F \vee F_0$-measurable is equivalent to

$$\gamma(x) = \eta(h(x), h_0(x)) \tag{A.4}$$

where $\eta$ is $\mathcal{A} \times \mathcal{A}_0$-measurable on $Z \times Z_0$, $h : X \to Z$, $h_0 : X \to Z_0$, and $\mathcal{A}$ and $\mathcal{A}_0$ are $\sigma$-fields on $Z$ and $Z_0$.

Let $\gamma(x;y) = \eta(h(x), h_0(y))$. Then given $y$, $\gamma(\cdot\,;y)$ is $F$-measurable.

$$
\begin{aligned}
E\{f(\gamma(x),x) \mid F_0\}(y) &= \int_X f(\eta(h(x),h_0(x)),x)\,dP_y^0(x) \\
&= \int_A f(\eta(h(x),h_0(x)),x)\,dP_y^0(x) + \int_{X-A} f(\eta(h(x),h_0(x)),x)\,dP_y^0(x)
\end{aligned} \tag{A.5}
$$

where

$$A = \{x : h_0(x) = h_0(y)\} \in F_0 \tag{A.6}$$

Given $A \in F_0$, for all $B \in F_0$ (see Ref. [L4])

$$\int_B P_y^0(A)\,d\mu(y) = \mu(A \cap B) = \int_B 1_A(x)\,d\mu(x) \tag{A.7}$$

Therefore for all $A \in F_0$,

$$P_y^0(A) = 1_A(y) \quad \text{for almost all } y \tag{A.8}$$

where $1_A$ is the indicator function of $A$.

From equation (A.6), $y \in A$. Thus

$$P_y^0(A) = 1 \quad \text{for almost all } y \tag{A.9}$$

Equation (A.5) then becomes

$$
\begin{aligned}
E\{f(\gamma(x),x) \mid F_0\}(y) &= \int_A f(\eta(h(x),h_0(x)),x)\,dP_y^0(x) \\
&= \int_X f(\eta(h(x),h_0(y)),x)\,dP_y^0(x) \\
&= E\{f(\gamma(x;y),x) \mid F_0\}(y)
\end{aligned} \tag{A.10}
$$

Q.E.D.

Remark: If $F = \{X, \emptyset\}$ then this result reduces to the usual identity

$$E\{f(\gamma(x),x) \mid F_0\}(y) = E\{f(\gamma(y),x) \mid F_0\}(y) \tag{A.11}$$

For a discussion of substitution in conditional expectation, see [B1].

Lemma A.3: Let $f(u,v,y,z,x)$ be a function such that $x$, $y$, $z$ are random variables. Suppose it is desired to choose $u(y,z)$ and $v(y)$ such that $E\{f(u(y,z),v(y),y,z,x)\}$ is minimized.

Let $u^0(y,z)$, $v^0(y)$ be the minimum of

$$\min_{u,\ v(\cdot)} E\{f(u,v(y),y,z,x) \mid y,z\}$$

Then

$$
\begin{aligned}
\min_{u(\cdot,\cdot),\ v(\cdot)} E\{f(u(y,z),v(y),y,z,x)\} &= E\{f(u^0(y,z),v^0(y),y,z,x)\} \\
&= E\Big\{\min_{u,\ v(\cdot)} E\{f(u,v(y),y,z,x) \mid y,z\}\Big\}
\end{aligned} \tag{A.12}
$$

Proof:

$$E\{f(u^0(y,z),v^0(y),y,z,x) \mid y,z\} \le E\{f(u(y,z),v(y),y,z,x) \mid y,z\} \quad \text{for all } u(\cdot,\cdot),\ v(\cdot) \tag{A.13}$$

Thus

$$
\begin{aligned}
E\{f(u^0(y,z),v^0(y),y,z,x)\} &= E\big\{E\{f(u^0(y,z),v^0(y),y,z,x) \mid y,z\}\big\} \\
&\le E\{f(u(y,z),v(y),y,z,x)\} \quad \text{for all } u(\cdot,\cdot),\ v(\cdot)
\end{aligned} \tag{A.14}
$$

or

$$E\{f(u^0(y,z),v^0(y),y,z,x)\} \le \min_{u(\cdot,\cdot),\ v(\cdot)} E\{f(u(y,z),v(y),y,z,x)\} \tag{A.15}$$

But

$$\min_{u(\cdot,\cdot),\ v(\cdot)} E\{f(u(y,z),v(y),y,z,x)\} \le E\{f(u^0(y,z),v^0(y),y,z,x)\} \tag{A.16}$$

Hence we obtain equation (A.12).

Q.E.D.



APPENDIX B 


INVERTIBILITY OF $K_i(k)$ AND VERIFICATION OF EQUATION (4.4.32)


1. Invertibility of $K_i(k)$

$$K_i(k) = Q_i + A_{ii}'S_i(k+1)A_{ii}, \qquad K_i(T) = F_i > 0 \tag{B.1}$$

where

$$S_i(k+1) = K_i(k+1) - K_i(k+1)B_i T_i^{-1}(k+1)B_i'K_i(k+1) \tag{B.2}$$

If $K_i(k+1) > 0$, then $K_i^{-1}(k+1)$ exists,

$$S_i(k+1) = \big[K_i^{-1}(k+1) + B_i R_i^{-1}B_i'\big]^{-1} \tag{B.3}$$

and

$$A_{ii}'S_i(k+1)A_{ii} \ge 0 \tag{B.4}$$

If $Q_i > 0$, then $K_i(k) > 0$.

Therefore by induction $K_i(k)$ is invertible.

Remark: $Q_i > 0$ is sufficient but not necessary. If $A_{ii} > 0$ then $K_i(k) > 0$.


2. Verification of Equation (4.4.32)

$$
\begin{aligned}
T_i^{-1}(k+1)B_i'\big(I - K_i(k+1)B_i T_i^{-1}(k+1)B_i'\big)^{-1}
&= T_i^{-1}(k+1)B_i'K_i(k+1)S_i^{-1}(k+1) \\
&= T_i^{-1}(k+1)B_i'K_i(k+1)\big[K_i^{-1}(k+1) + B_i R_i^{-1}B_i'\big] \\
&= T_i^{-1}(k+1)\big[R_i + B_i'K_i(k+1)B_i\big]R_i^{-1}B_i' \\
&= R_i^{-1}B_i'
\end{aligned} \tag{B.5}
$$
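Both the equivalence of (B.2) with (B.3) and the identity (B.5) can be spot-checked with exact arithmetic. The sketch below is restricted to the scalar case, with arbitrarily chosen values for $K_i(k+1)$, $B_i$ and $R_i$; the function name and the numbers are assumptions made for illustration only. The matrix case follows the same algebra but is not exercised here.

```python
from fractions import Fraction as F

def appendix_b_check(k, b, r):
    """Scalar versions of (B.2), (B.3) and both sides of (B.5)."""
    t = r + b * k * b                                # T_i(k+1) = R_i + B_i' K_i(k+1) B_i
    s_direct = k - k * b * (1 / t) * b * k           # (B.2)
    s_inverse_form = 1 / (1 / k + b * (1 / r) * b)   # (B.3)
    lhs = (1 / t) * b / (1 - k * b * (1 / t) * b)    # left side of (B.5)
    rhs = (1 / r) * b                                # right side of (B.5)
    return s_direct, s_inverse_form, lhs, rhs

s_direct, s_inverse_form, lhs, rhs = appendix_b_check(F(3, 2), F(2), F(5))
print(s_direct == s_inverse_form, lhs == rhs)
```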
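APPENDIX C


VERIFICATION OF EQUATION (5.7.19)

$$
\begin{aligned}
&\sum_{k=0}^{T-1}\Big[\operatorname{tr} Q_i\Sigma_i(k|k) + \operatorname{tr} K_i(k+1)\big[\Sigma_i(k+1|k) - \Sigma_i(k+1|k+1)\big]\Big] + \operatorname{tr} K_i(T)\Sigma_i(T|T) \\
&= \sum_{k=0}^{T-1}\Big[\operatorname{tr} Q_i\Sigma_i(k|k) + \operatorname{tr} K_i(k+1)\big[A_{ii}\Sigma_i(k|k)A_{ii}' - \Sigma_i(k+1|k+1) + \Xi_i(k)\big]\Big] + \operatorname{tr} K_i(T)\Sigma_i(T|T) \\
&= \sum_{k=0}^{T-1}\Big[\operatorname{tr}\big(Q_i + A_{ii}'K_i(k+1)A_{ii}\big)\Sigma_i(k|k) + \operatorname{tr} K_i(k+1)\Xi_i(k) - \operatorname{tr} K_i(k+1)\Sigma_i(k+1|k+1)\Big] \\
&\qquad + \operatorname{tr} K_i(T)\Sigma_i(T|T) + \operatorname{tr} K_i(0)\Sigma_i(0|0) - \operatorname{tr} K_i(0)\Sigma_i(0|0) \\
&= \operatorname{tr} K_i(0)\Sigma_i(0|0) + \sum_{k=0}^{T-1}\Big[\operatorname{tr}\big(Q_i + A_{ii}'K_i(k+1)A_{ii} - K_i(k)\big)\Sigma_i(k|k) + \operatorname{tr} K_i(k+1)\Xi_i(k)\Big] \\
&= c_i(0)
\end{aligned} \tag{C.1}
$$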

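From equation (4.5.6),

$$
\begin{aligned}
\sum_{i=1}^{N}s_i(0) = {}&\bar{x}'(0)A'S(1)A\bar{x}(0) + c(0) + \sum_{k=1}^{T-1}\Big[-r'(k)\bar{K}^{-1}(k)Q\bar{K}^{-1}(k)r(k) \\
&+ \lambda^{*\prime}(k)\bar{K}^{-1}(k)Q\bar{K}^{-1}(k)r(k) - \tfrac{1}{4}\lambda^{*\prime}(k)\big[\bar{K}^{-1}(k)Q\bar{K}^{-1}(k) + BR^{-1}B'\big]\lambda^*(k)\Big] \\
&- \tfrac{1}{4}\lambda^{*\prime}(T)S^{-1}(T)\lambda^*(T) - r'(T)\bar{K}^{-1}(T)r(T) + \lambda^{*\prime}(T)\bar{K}^{-1}(T)r(T)
\end{aligned} \tag{C.2}
$$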
From equation (4.4.37),

$$r(k) = -\bar{K}(k)\bar{x}(k) + \tfrac{1}{2}\lambda^*(k), \quad k = 1, \ldots, T-1 \tag{C.3}$$

From equation (5.6.16),

$$r(T) = \tfrac{1}{2}\lambda^*(T) - \tfrac{1}{2}\bar{K}(T)K^{-1}(T)\lambda^*(T) \tag{C.4}$$

Thus, substituting (C.3) and (C.4) into (C.2),

$$
\begin{aligned}
\sum_{i=1}^{N}s_i(0) + r'(T)\bar{K}^{-1}(T)\big[r(T) - \tfrac{1}{2}\lambda^*(T)\big]
&= c(0) + \bar{x}'(0)\bar{K}(0)\bar{x}(0) - \Big[\sum_{k=0}^{T-1}\big(\bar{x}'(k)Q\bar{x}(k) + u^{*\prime}(k)Ru^*(k)\big) + \bar{x}'(T)K(T)\bar{x}(T)\Big] \\
&= c(0) + \bar{x}'(0)\bar{K}(0)\bar{x}(0) - \bar{x}'(0)K(0)\bar{x}(0)
\end{aligned}
$$


APPENDIX D 


VERIFICATION OF EQUATION (5.6.22) 

u* (k) = -\ R -1 B , £*(k;k) (D.l) 

From (5.4.9) 

£*(k;k) = - 2 K(k+l)x(k+l|k) 

= - 2 K(k+1) [A x(k|k) + B u*(k)] (D.2) 

Therefore 

R u*(k) = B^'K(k+l) [A x(k|k) + u* (k) ] — - (D.3) 

or 

u* (k) = T -1 (k+l)B'K(k+l)A x(k|k) (D.4) 

Q.E.D. 